Abstract: We consider the problem of utility optimal scheduling
in general processing networks with random arrivals and
network conditions. These are generalizations of traditional data
networks where commodities in one or more queues can be
combined to produce new commodities that are delivered to other
parts of the network. This can be used to model problems such as
in-network data fusion, stream processing, and grid computing.
Scheduling actions are complicated by the underflow problem that
arises when some queues with required components go empty.
In this paper, we develop the Perturbed Max-Weight algorithm
(PMW) to achieve optimal utility. The idea of PMW is to perturb
the weights used by the usual Max-Weight algorithm to "push"
queue levels towards non-zero values (avoiding underflows). We
show that when the perturbations are carefully chosen, PMW is
able to achieve a utility that is within $O(1/V)$ of the optimal
value for any $V \ge 1$, while ensuring an average network backlog
of $O(V)$.

Index Terms — Dynamic Control, Processing Networks, Data 
Fusion, Lyapunov Analysis, Stochastic Optimization 



I. Introduction 

Recently, there has been much attention on developing optimal
scheduling algorithms for the class of processing networks,
e.g., [1], [2], [3], [4], [5]. These networks are generalizations
of traditional data networks. Contents in these networks can
represent information, data packets, or certain raw materials
that need to go through multiple processing stages in the
network before they can be utilized. One example of such
processing networks is the Fork and Join network, which
models, e.g., stream processing [6], [7] and grid
computing [8]. In the stream processing case, the contents
in the network represent different types of data, say voice
and video, that need to be combined or jointly compressed,
and the network topology represents a particular sequence
of operations that needs to be conducted during processing.
Another example of a processing network is a sensor network
that performs data fusion [9], in which case sensor data must
first be fused before it is delivered. Finally, these processing
networks also contain the class of manufacturing networks,
where raw materials are assembled into products [3], [5].

In this paper, we develop optimal scheduling algorithms 
for the following general utility maximization problem in 
processing networks. We are given a discrete time stochastic 
processing network. The network state, which describes the 
network randomness (such as random channel conditions or 
commodity arrivals), is time varying according to some probability
law. A network controller performs some action at every
time slot, based on the observed network state, and subject
to the constraint that the network queues must have enough 
contents to support the action. The chosen action generates 
some utility, but also consumes some amount of contents from 
some queues, and possibly generates new contents for some 
other queues. These contents cause congestion, and thus lead 
to backlogs at queues in the network. The goal of the controller 
is to maximize its time average utility subject to the constraint 
that the time average total backlog in the network is finite. 

Many of the utility maximization problems in data networks
fall into this general framework. For instance, [10], [11],
[12], [13], [14] can be viewed as special cases of the above
framework which allow scheduling actions to be independent
of the content level in the queues (see [15] for a survey of
problems in data networks). By comparing the processing
networks with the data networks, we note that the main
difficulty in performing utility optimal scheduling in these
processing networks is that we need to build an optimal
scheduling algorithm on top of a mechanism that prevents
queue underflows. Such scheduling problems with underflow
constraints are usually formulated as dynamic programs, e.g.,
[16], which require substantial statistical knowledge of the
network randomness, and are usually very difficult to solve.

In this paper, we develop the Perturbed Max-Weight algorithm
(PMW) for achieving optimal utility in processing
networks. PMW is a greedy algorithm that makes decisions
every time slot, without requiring any statistical knowledge of
the network randomness. PMW is based on the Max-Weight
algorithm developed in the data network context [17], [18].
There, Max-Weight has been shown to be able to achieve
a time average utility that is within $O(1/V)$ of the optimal
network utility for any $V \ge 1$, while ensuring that the average
network delay is $O(V)$, when the network dynamics are i.i.d.
[18]. The idea of PMW is to perturb the weights used in the
Max-Weight algorithm so as to "push" the queue sizes towards
some nonzero values. Doing so properly, we can ensure that
the queues always have enough contents for the scheduling
actions. Once this is accomplished, we then do scheduling as
in the usual Max-Weight algorithm with the perturbed weights.
In this way, we simultaneously avoid queue underflows and
achieve good utility performance, and also eliminate the need
to solve complex dynamic programs.

The PMW algorithm is quite different from the approaches
used in the processing network literature. [1] analyzes
manufacturing networks using Brownian approximations. [2] applies
the Max-Weight algorithm to do scheduling in manufacturing
networks, assuming all the queues always have enough contents.
[3] develops the Deficit Max-Weight algorithm (DMW),
by using Max-Weight based on an alternative control process
for decision making. [4] formulates the problem as a
convex optimization problem to match the input and output
rates of the queues, without considering the queueing level
dynamics. PMW instead provides a way to explicitly avoid
queue underflows, and allows us to compute explicit backlog
bounds. Our algorithm is perhaps most similar to the DMW
algorithm in [3]. DMW achieves the desired performance by
bounding the "deficit" incurred by the algorithm and applies
to both stability and utility maximization problems. In contrast,
PMW uses perturbations to avoid deficits entirely and allows
for more general time varying system dynamics, e.g., random
arrivals and random costs.

The paper is organized as follows: In Section II we set up
our notations. In Section III we present a study on a data
fusion example to demonstrate the main idea of the paper.
In Section IV we state the general network model and the
scheduling problem. In Section V we characterize optimality,
and in Section VI we develop the PMW algorithm and show
its utility can approach the optimum. Section VII constructs
a PMW algorithm for a more specific yet general network.
Simulation results are presented in Section VIII.

II. Notations 

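Here we first set up the notations used in this paper: $\mathbb{R}$
represents the set of real numbers. $\mathbb{R}_+$ (or $\mathbb{R}_-$) denotes the
set of nonnegative (or non-positive) real numbers. $\mathbb{R}^n$ (or $\mathbb{R}^n_+$)
is the set of $n$ dimensional column vectors, with each element
being in $\mathbb{R}$ (or $\mathbb{R}_+$). Bold symbols $\boldsymbol{a}$ and $\boldsymbol{a}^T$ represent a
column vector and its transpose. $\boldsymbol{a} \succeq \boldsymbol{b}$ means vector $\boldsymbol{a}$ is
entrywise no less than vector $\boldsymbol{b}$. $\|\boldsymbol{a} - \boldsymbol{b}\|$ is the Euclidean
distance of $\boldsymbol{a}$ and $\boldsymbol{b}$. $\boldsymbol{0}$ and $\boldsymbol{1}$ denote column vectors with all
elements being 0 and 1. For any two vectors $\boldsymbol{a} = (a_1, ..., a_n)^T$
and $\boldsymbol{b} = (b_1, ..., b_n)^T$, the vector $\boldsymbol{a} \otimes \boldsymbol{b} = (a_1 b_1, ..., a_n b_n)^T$.
Finally $[a]^+ = \max[a, 0]$.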
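
As an illustration only, the vector conventions above can be mirrored in a few lines of plain Python. The helper names (`entrywise_geq`, `entrywise_product`, `euclidean_distance`, `positive_part`) are hypothetical and are not part of the paper.

```python
import math

def entrywise_geq(a, b):
    """Return True when vector a is entrywise no less than vector b."""
    return all(x >= y for x, y in zip(a, b))

def entrywise_product(a, b):
    """Entrywise product (a_1*b_1, ..., a_n*b_n)."""
    return [x * y for x, y in zip(a, b)]

def euclidean_distance(a, b):
    """Euclidean distance ||a - b||."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def positive_part(a):
    """[a]^+ = max[a, 0] for a scalar a."""
    return max(a, 0)
```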

III. A DATA PROCESSING EXAMPLE 

In this section, we study a data fusion example and develop
the Perturbed Max-Weight algorithm (PMW) in this case. This
example demonstrates the main idea of this paper. We will later
present our general model in Section IV.

A. Network Settings 

We consider the network shown in Fig. 1, which
performs a 2-stage data processing for the data entering
the network.

Fig. 1. An example network consisting of three queues $q_1, q_2, q_3$ and two
processors $P_1, P_2$.

In this network, there are two random data streams
$R_1(t), R_2(t)$, which represent, e.g., sensed data that come into
sensors, or video and voice data that need to be mixed. We
assume that $R_i(t) = 1$ or $0$, equally likely, for $i = 1, 2$. At
every time slot, the network controller first decides whether
or not to admit the new arrivals, given that accepting any one
new arrival unit incurs a cost of 1. The controller then has
to decide how to activate the two processors $P_1, P_2$ for data
processing. We assume that both processors can be activated
simultaneously. When activated, $P_1$ consumes one unit of data
from both $q_1$ and $q_2$, and generates one unit of fused data
into $q_3$. This data needs further processing that is done by
$P_2$. When $P_2$ is activated, it consumes one unit of data from
$q_3$, and generates one unit of processed data. We assume that
each unit of successfully fused and processed data generates
a profit of $p(t)$, where $p(t)$ is i.i.d. and takes value 3 or 1
with equal probabilities. The network controller's objective is
to maximize the average utility, i.e., profit minus cost, subject
to queue stability.

For the ease of presenting the general model later, we define
a network state $S(t) = (R_1(t), R_2(t))$ (see footnote 1), which describes the
current network randomness. We also denote the controller's
action at time $t$ to be $x(t) = (D_1(t), D_2(t), I_1(t), I_2(t))$,
where $D_j(t) = 1$ ($D_j(t) = 0$) means to admit (reject) the
new arrivals into queue $j$, and $I_i(t) = 1$ ($I_i(t) = 0$) means
processor $P_i$ is activated (turned off). We note the following
no-underflow constraints must be met for all time when we
activate processors $P_1, P_2$:

\[
I_1(t) \le q_1(t), \quad I_1(t) \le q_2(t), \quad I_2(t) \le q_3(t). \tag{1}
\]

That is, $I_1(t) = 1$ only when $q_1$ and $q_2$ are both nonempty,
and $I_2(t) = 1$ only if $q_3$ is nonempty. Note that [3] is the first
to identify such no-underflow constraints and propose an explicit
solution to the queue underflow problems in the context of
a processing network. Subject to (1), we can then write the
amount of arrivals into $q_1, q_2, q_3$, and the service rates of the
queues at time $t$ as functions of the network state $S(t)$ and
the action $x(t)$, i.e.,

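\[
\begin{aligned}
A_j(t) &= A_j(S(t), x(t)) = D_j(t) R_j(t), \quad j = 1, 2, \\
A_3(t) &= A_3(S(t), x(t)) = I_1(t),
\end{aligned} \tag{2}
\]

\[
\begin{aligned}
\mu_j(t) &= \mu_j(S(t), x(t)) = I_1(t), \quad j = 1, 2, \\
\mu_3(t) &= \mu_3(S(t), x(t)) = I_2(t).
\end{aligned} \tag{3}
\]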

Then we see that the queues evolve according to the following:

\[
q_j(t+1) = q_j(t) - \mu_j(t) + A_j(t), \quad j = 1, 2, 3, \ \forall t. \tag{4}
\]
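
The constraint (1) and the update (2)-(4) can be sketched as a single one-slot function. This is a minimal illustration for the three-queue example only; the function name `step_queues` and the tuple layout of its arguments are assumptions made here, not notation from the paper.

```python
def step_queues(q, R, D, I):
    """Advance the three-queue example by one slot.

    q = (q1, q2, q3): current backlogs
    R = (R1, R2):     arrivals in this slot, each 0 or 1
    D = (D1, D2):     admission decisions, each 0 or 1
    I = (I1, I2):     processor activations, each 0 or 1
    """
    q1, q2, q3 = q
    I1, I2 = I
    # No-underflow constraint (1): a processor may run only if its supply queues are nonempty.
    if I1 > min(q1, q2) or I2 > q3:
        raise ValueError("action violates the no-underflow constraint (1)")
    A = (D[0] * R[0], D[1] * R[1], I1)   # arrivals, eq. (2)
    mu = (I1, I1, I2)                    # service, eq. (3)
    # Queue update, eq. (4).
    return tuple(qj - mj + aj for qj, mj, aj in zip(q, mu, A))
```

For instance, starting from backlogs (1, 1, 0) with an arrival only on the first stream, admitting and activating $P_1$ moves one fused unit into $q_3$ and leaves the queues at (1, 0, 1).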

The instantaneous utility is given by:

\[
f(t) = f(S(t), x(t)) = p(t) I_2(t) - D_1(t) R_1(t) - D_2(t) R_2(t). \tag{5}
\]

The goal is to maximize the time average value of $f(t)$ subject
to network stability.
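
A direct transcription of the utility (5), together with an empirical time average over a finite sample path, might look as follows. The names `utility` and `time_average` are hypothetical, and a finite-horizon average is only a stand-in for the limiting time average the objective refers to.

```python
def utility(p, R, D, I2):
    """Instantaneous utility (5): profit from P2's output minus the admission cost."""
    return p * I2 - D[0] * R[0] - D[1] * R[1]

def time_average(values):
    """Empirical average of a finite, nonempty sequence of per-slot utilities."""
    return sum(values) / len(values)
```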

Note that the constraint (1) greatly complicates the design of
an optimal scheduling algorithm. This is because the decision
made at time $t$ may affect the queue states in future time slots,
which can in turn affect the set of possible actions in the future.

1 The network state here contains just $R_1(t)$ and $R_2(t)$. More complicated
settings, where the amount consumed from queues may also depend on the
random link conditions between queues and processors, can also be modeled
by incorporating the link components into the network state, e.g., [19].

In the following, we will develop the Perturbed Max-Weight
algorithm (PMW) for this example. The idea of PMW is to use
the usual Max-Weight algorithm, but to perturb the weights
so as to push the queue sizes towards certain nonzero values.
By carefully designing the perturbation, we can simultaneously
ensure that the queues always have enough data for processing
and the achieved utility is close to optimal.

B. The Perturbed Max-Weight algorithm (PMW) 

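We now present the construction of the PMW algorithm
for this simple example (this is extended to general network
models in Section VI). To start, we first define a perturbation
vector $\boldsymbol{\theta} = (\theta_1, \theta_2, \theta_3)^T$ and the Lyapunov function
$L(t) = \frac{1}{2} \sum_{j=1}^{3} (q_j(t) - \theta_j)^2$. We then define the one-slot conditional
drift as:

\[
\Delta(t) = \mathbb{E}\{L(t+1) - L(t) \mid \boldsymbol{q}(t)\}, \tag{6}
\]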

where the expectation is taken over the random network state
$S(t)$ and the randomness over the actions. Using the queueing
dynamics (4), it is easy to obtain that:

\[
\Delta(t) \le B - \sum_{j=1}^{3} \mathbb{E}\{(q_j(t) - \theta_j)[\mu_j(t) - A_j(t)] \mid \boldsymbol{q}(t)\},
\]

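where $B = 3$. Now we use the "drift-plus-penalty" approach
in [18] to design our algorithm for this problem. To do so,
we define a control parameter $V \ge 1$, which will affect
our utility-backlog tradeoff, and add to both sides the term
$-V\mathbb{E}\{f(t) \mid \boldsymbol{q}(t)\}$ to get:

\[
\Delta(t) - V\mathbb{E}\{f(t) \mid \boldsymbol{q}(t)\}
\le B - V\mathbb{E}\{f(t) \mid \boldsymbol{q}(t)\}
- \sum_{j=1}^{3} \mathbb{E}\{(q_j(t) - \theta_j)[\mu_j(t) - A_j(t)] \mid \boldsymbol{q}(t)\}. \tag{7}
\]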

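Denote $\Delta_V(t) = \Delta(t) - V\mathbb{E}\{f(t) \mid \boldsymbol{q}(t)\}$, and plug (2), (3)
and (5) into the above, to get:

\[
\begin{aligned}
\Delta_V(t) \le B &+ \mathbb{E}\{D_1(t) R_1(t) [q_1(t) - \theta_1 + V] \mid \boldsymbol{q}(t)\} \\
&+ \mathbb{E}\{D_2(t) R_2(t) [q_2(t) - \theta_2 + V] \mid \boldsymbol{q}(t)\} \\
&- \mathbb{E}\{I_2(t) [q_3(t) - \theta_3 + p(t)V] \mid \boldsymbol{q}(t)\} \\
&- \mathbb{E}\{I_1(t) [q_1(t) - \theta_1 + q_2(t) - \theta_2 - (q_3(t) - \theta_3)] \mid \boldsymbol{q}(t)\}.
\end{aligned} \tag{8}
\]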

We now develop our PMW algorithm by choosing an action
at every time slot to minimize the right-hand side (RHS) of
(8) subject to (1). The algorithm then works as follows:

PMW: At every time slot, observe $S(t)$ and $\boldsymbol{q}(t)$, and do
the following:

1) Data Admission: Choose $D_j(t) = 1$, i.e., admit the new
arrivals to $q_j$ if:

\[
q_j(t) - \theta_j + V < 0, \quad j = 1, 2, \tag{9}
\]

else set $D_j(t) = 0$ and reject the arrivals.

2) Processor Activation: Choose $I_1(t) = 1$, i.e., activate
processor $P_1$, if $q_1(t) \ge 1$, $q_2(t) \ge 1$, and:

\[
q_1(t) - \theta_1 + q_2(t) - \theta_2 - (q_3(t) - \theta_3) > 0, \tag{10}
\]

else choose $I_1(t) = 0$. Similarly, choose $I_2(t) = 1$, i.e.,
activate processor $P_2$, if $q_3(t) \ge 1$, and:

\[
q_3(t) - \theta_3 + p(t)V > 0, \tag{11}
\]

else choose $I_2(t) = 0$.

3) Queueing update: Update $q_j(t)$, $\forall j$, according to (4).
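
The admission and activation rules above can be sketched as one decision function. This is a minimal illustration for the example only; `pmw_action` and its argument layout are hypothetical names chosen here, and step 3 (the queue update) is left to the caller.

```python
def pmw_action(q, theta, p, V):
    """Return (D1, D2, I1, I2) chosen by PMW in the three-queue example.

    q and theta are length-3 sequences (q1, q2, q3) and (theta1, theta2, theta3);
    p is the current profit value and V the control parameter.
    """
    # Data admission, rule (9): admit to q_j when q_j - theta_j + V < 0.
    D1 = 1 if q[0] - theta[0] + V < 0 else 0
    D2 = 1 if q[1] - theta[1] + V < 0 else 0
    # Processor P1, rule (10), guarded by the nonempty checks on q1 and q2.
    weight_1 = (q[0] - theta[0]) + (q[1] - theta[1]) - (q[2] - theta[2])
    I1 = 1 if (q[0] >= 1 and q[1] >= 1 and weight_1 > 0) else 0
    # Processor P2, rule (11), guarded by the nonempty check on q3.
    I2 = 1 if (q[2] >= 1 and q[2] - theta[2] + p * V > 0) else 0
    return D1, D2, I1, I2
```

With $V = 10$ and $\boldsymbol{\theta} = (20, 20, 30)$, empty queues lead to admission on both streams and no processing, while backlogs (10, 10, 5) with $p(t) = 3$ lead to no admission and both processors being activated.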

C. Performance of PMW 

Here we analyze the performance of PMW. We will first
prove the following important claim: under a proper $\boldsymbol{\theta}$ vector,
PMW minimizes the RHS of (8) over all possible policies of
arrival admission and processor activation, including those
that choose actions regardless of the constraint (1). We
then use this claim to prove the performance of PMW, by
comparing the value of the RHS of (8) under PMW versus
that under an alternate policy.

To prove the claim, we first see that the policy that
minimizes the RHS of (8) without the constraint (1) differs
from PMW only in the processor activation part, where PMW
also considers the constraints $q_1(t) \ge 1$, $q_2(t) \ge 1$ and
$q_3(t) \ge 1$. Thus if one can show that these constraints are
indeed redundant in the PMW algorithm under a proper $\boldsymbol{\theta}$
vector, i.e., one can activate the processors without considering
them but still satisfy them, then PMW minimizes the RHS of
(8) over all possible policies. In the following, we will use the
following $\theta_j$ values:

\[
\theta_1 = 2V, \quad \theta_2 = 2V, \quad \theta_3 = 3V. \tag{12}
\]

Let us now look at the queue sizes $q_j(t)$, $j = 1, 2, 3$. From
(11), we see that $P_2$ is activated if and only if:

\[
q_3(t) \ge \theta_3 - p(t)V + 1, \quad \text{and} \quad q_3(t) \ge 1. \tag{13}
\]

Hence $I_2(t) = 1$ whenever $q_3(t) \ge \theta_3 - V + 1$, but $I_2(t) = 0$
unless $q_3(t) \ge \theta_3 - 3V + 1$. Since $q_3$ can receive and deliver
at most one unit of data at a time, we get:

\[
\theta_3 - V + 1 \ge q_3(t) \ge \theta_3 - 3V, \quad \forall t. \tag{14}
\]

Using $\theta_3 = 3V$, this implies:

\[
2V + 1 \ge q_3(t) \ge 0, \quad \forall t. \tag{15}
\]

This shows that with $\theta_3 = 3V$, the activations of $P_2$ are always
feasible even if we do not consider the constraint $q_3(t) \ge 1$.

We now look at $q_1(t)$ and $q_2(t)$. We see from (9) that for
$\theta_1, \theta_2 \ge V$, we have:

\[
q_j(t) \le \theta_j - V, \quad j = 1, 2. \tag{16}
\]

Also, using (10) and (14), it is easy to see that when $I_1(t) = 1$,
i.e., when $P_1$ is turned on, we have:

\[
q_1(t) - \theta_1 + q_2(t) - \theta_2 > q_3(t) - \theta_3 \ge -3V. \tag{17}
\]

Combining (17) with (16), we see that if $I_1(t) = 1$, we have:

\[
q_j(t) \ge 1, \quad j = 1, 2. \tag{18}
\]

This is so because, e.g., if $q_1(t) = 0$, then $q_1(t) - \theta_1 = -\theta_1 =
-2V$. Since $q_2(t) - \theta_2 \le -V$ by (16), we thus have:

\[
q_1(t) - \theta_1 + q_2(t) - \theta_2 \le -2V - V = -3V,
\]

which cannot be greater than $-3V$ as required by (17). Thus by (15) and
(18), we have:

\[
q_j(t) \ge 0, \quad j = 1, 2, 3, \ \forall t. \tag{19}
\]

This shows that by using the values in (12), PMW automatically
ensures that no queue underflow happens, and hence
PMW minimizes the RHS of (8) over all possible policies.
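
The redundancy argument can be checked numerically with a small simulation that applies rules (9)-(11) while deliberately omitting the nonempty checks, and then verifies that no queue ever goes negative. This is a sketch under stated assumptions: $V$ is a positive integer, all queues start empty, and the name `simulate_pmw`, the seed, and the horizon are choices made here for illustration.

```python
import random

def simulate_pmw(V, T, seed=0):
    """Run PMW on the three-queue example WITHOUT the explicit nonempty checks.

    Returns (list of backlog tuples over time, empirical average utility).
    """
    rng = random.Random(seed)
    theta = (2 * V, 2 * V, 3 * V)      # perturbation values from (12)
    q = [0, 0, 0]                      # empty initial queues (assumption of this sketch)
    history, total_utility = [tuple(q)], 0
    for _ in range(T):
        R = (rng.randint(0, 1), rng.randint(0, 1))   # arrivals: 0 or 1, equally likely
        p = rng.choice((1, 3))                        # profit: 1 or 3, equally likely
        D = [1 if q[j] - theta[j] + V < 0 else 0 for j in (0, 1)]   # rule (9)
        weight_1 = (q[0] - theta[0]) + (q[1] - theta[1]) - (q[2] - theta[2])
        I1 = 1 if weight_1 > 0 else 0                 # rule (10) only, no nonempty check
        I2 = 1 if q[2] - theta[2] + p * V > 0 else 0  # rule (11) only, no nonempty check
        total_utility += p * I2 - D[0] * R[0] - D[1] * R[1]   # utility (5)
        # Queue update (4) with arrivals (2) and service (3).
        q = [q[0] - I1 + D[0] * R[0], q[1] - I1 + D[1] * R[1], q[2] - I2 + I1]
        history.append(tuple(q))
    return history, total_utility / T

history, avg_utility = simulate_pmw(V=20, T=20000)
assert all(min(q) >= 0 for q in history)   # no underflow along this sample path
```

The assertion reflects the deterministic bound (19) rather than a statistical property, so it should hold for any seed under the assumptions listed above; the returned average utility, by contrast, is a random quantity and is not checked here.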

Given the above observation, the utility performance of
PMW can now be analyzed as for the usual Max-Weight algorithm.
Specifically, following the usual Max-Weight argument, we can
compare the drift under PMW with a stationary randomized
algorithm which chooses scheduling actions purely as a function
of $S(t)$, and achieves $\mathbb{E}\{\mu_j(t) - A_j(t) \mid \boldsymbol{q}(t)\} = 0$ for
all $j$ and $\mathbb{E}\{f(t) \mid \boldsymbol{q}(t)\} = f^*_{av} = \frac{1}{2}$, where $f^*_{av}$ is the optimal
average utility. Note that this comparison would not have been
possible here without using the perturbation to ensure (19).
Now plugging this policy into (8), we obtain:

\[
\Delta(t) - V\mathbb{E}\{f(t) \mid \boldsymbol{q}(t)\} \le B - V f^*_{av}. \tag{20}
\]

Taking expectations over $\boldsymbol{q}(t)$ on both sides and summing
over $t = 0, 1, ..., T-1$, we get:

\[
\mathbb{E}\{L(T) - L(0)\} - V \sum_{t=0}^{T-1} \mathbb{E}\{f(t)\} \le TB - VT f^*_{av}. \tag{21}
\]

Now rearranging the terms, dividing both sides by $VT$, and
using the fact that $L(t) \ge 0$, we get:

\[
\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}\{f(t)\} \ge f^*_{av} - \frac{B}{V} - \frac{\mathbb{E}\{L(0)\}}{TV}. \tag{22}
\]

Taking a liminf as $T \to \infty$, and using $\mathbb{E}\{L(0)\} < \infty$, we
get:

\[
f^{PMW}_{av} = \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}\{f(t)\} \ge f^*_{av} - \frac{B}{V}, \tag{23}
\]

where $f^{PMW}_{av}$ denotes the time average utility achieved by
PMW. This thus shows that PMW is able to achieve a time
average utility that is within $O(1/V)$ of the optimal value,
and guarantees $q_j(t) \le O(V)$ for all time. Note that PMW
is similar to the DMW algorithm developed in [3]. However,
DMW allows the queues to be empty when activating processors,
which may lead to "deficit," whereas PMW effectively
avoids this by using a perturbation vector.
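
As a sanity check on the benchmark value in this example, the following short calculation suggests why the optimal average utility equals $\frac{1}{2}$. It is a sketch based on a flow-balance argument and is not taken from the paper; $r$ is a hypothetical symbol for the common long-run rate at which data is admitted into each of $q_1, q_2$, fused by $P_1$, and processed by $P_2$ under a stable policy.

```latex
\begin{align*}
\text{admission cost rate} &= 2r, \qquad 0 \le r \le \tfrac{1}{2}
   \quad \text{(each stream delivers a unit with probability } \tfrac{1}{2}\text{)},\\
\text{profit rate} &\le 3r
   \quad \text{(attained by activating } P_2 \text{ only in slots with } p(t) = 3,
   \text{ which occur with probability } \tfrac{1}{2}\text{)},\\
\text{utility rate} &\le 3r - 2r = r \le \tfrac{1}{2}.
\end{align*}
```

Under this reading, with $B = 3$ and, say, $V = 100$, the utility guarantee derived above is at least $\frac{1}{2} - \frac{3}{100} = 0.47$.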

In the following, we will present the general processing network
utility optimization model, and analyze the performance
of the general PMW algorithm under this general model. Our
analysis uses a duality argument, and will be different from
that in [5]. As we will see, our approach allows one to analyze
the algorithm performance without proving the existence of an
optimal stationary and randomized algorithm.

IV. General System Model 

In this section, we present the general network model. We
consider a network controller that operates a general network
with the goal of maximizing the time average utility, subject
to the network stability. The network is assumed to operate in
slotted time, i.e., $t \in \{0, 1, 2, ...\}$. We assume there are $r \ge 1$
queues in the network.



A. Network State 

In every slot $t$, we use $S(t)$ to denote the current network
state, which indicates the current network parameters, such as
a vector of channel conditions for each link, or a collection
of other relevant information about the current network links
and arrivals. We assume that $S(t)$ is i.i.d. every time slot (see footnote 2),
with a total of $M$ different random network states denoted by
$\mathcal{S} = \{s_1, s_2, \ldots, s_M\}$. We let $\pi_{s_i} = \Pr\{S(t) = s_i\}$. The
network controller can observe $S(t)$ at the beginning of every
slot $t$, but the $\pi_{s_i}$ probabilities are not necessarily known.

B. The Utility, Traffic, and Service 

At each time $t$, after observing $S(t) = s_i$ and the network
backlog vector, the controller will perform an action $x(t)$.
This action represents the aggregate decisions made by the
controller at time $t$, which can include, e.g., in the previous
example, the set of processors to turn on, or the amount of
arriving contents to accept, or both.

We denote by $\mathcal{X}^{(s_i)}$ the set of all feasible actions for network
state $s_i$, assuming all the queues contain enough contents to
meet the scheduling requirements. Note that we always have
$x(t) = x^{(s_i)}$ for some $x^{(s_i)} \in \mathcal{X}^{(s_i)}$ whenever $S(t) = s_i$. The
set $\mathcal{X}^{(s_i)}$ is assumed to be time-invariant and compact for all
$s_i \in \mathcal{S}$. If the chosen action $x(t) = x^{(s_i)}$ at time $t$ can be
performed, i.e., it is feasible and all the queues have enough
contents, then the utility, traffic, and service generated by $x(t)$
are as follows:

(a) The chosen action has an associated utility given by the
utility function $f(t) = f(s_i, x^{(s_i)}) : \mathcal{X}^{(s_i)} \mapsto \mathbb{R}$;

(b) The amount of contents generated by the action to
queue $j$ is determined by the traffic function $A_j(t) =
A_j(s_i, x^{(s_i)}) : \mathcal{X}^{(s_i)} \mapsto \mathbb{R}_+$, in units of contents;

(c) The amount of contents consumed from queue $j$ by
the action is given by the rate function $\mu_j(t) =
\mu_j(s_i, x^{(s_i)}) : \mathcal{X}^{(s_i)} \mapsto \mathbb{R}_+$, in units of contents.

Note that $A_j(t)$ includes both the exogenous arrivals from outside
the network to queue $j$, and the endogenous arrivals from
other queues, i.e., the newly generated contents by processing
contents in some other queues, to queue $j$. We assume the
functions $f(s_i, \cdot)$, $\mu_j(s_i, \cdot)$ and $A_j(s_i, \cdot)$ are continuous,
time-invariant, their magnitudes are uniformly upper bounded by
some constant $\delta_{max} \in (0, \infty)$ for all $s_i$, $j$, and they are known
to the network operator.

In any actual algorithm implementation, however, we see
that not all actions in the set $\mathcal{X}^{(s_i)}$ can be performed when
$S(t) = s_i$, due to the fact that some queues may not have
enough contents for the action. We say that an action $x^{(s_i)} \in
\mathcal{X}^{(s_i)}$ is feasible at time $t$ with $S(t) = s_i$ only when the
following general no-underflow constraint is satisfied:

\[
q_j(t) \ge \mu_j(s_i, x^{(s_i)}), \quad \forall j. \tag{24}
\]

That is, all the queues must have contents greater than or
equal to what will be consumed. In the following, we assume
there exists a set of actions $\{x_k^{(s_i)}\}_{i=1,\ldots,M}^{k=1,2,\ldots,r+2}$ (see footnote 3) with
$x_k^{(s_i)} \in \mathcal{X}^{(s_i)}$ and some variables $\vartheta_k^{(s_i)} \ge 0$ for all $s_i$ and $k$ with
$\sum_{k=1}^{r+2} \vartheta_k^{(s_i)} = 1$ for all $s_i$, such that:

\[
\sum_{s_i} \pi_{s_i} \Big\{ \sum_{k=1}^{r+2} \vartheta_k^{(s_i)} \big[ A_j(s_i, x_k^{(s_i)}) - \mu_j(s_i, x_k^{(s_i)}) \big] \Big\} \le -\eta, \tag{25}
\]

for some $\eta > 0$ for all $j$. That is, the "stability constraints"
are feasible with $\eta$-slackness. In the following, we use:

\[
\boldsymbol{A}(t) = (A_1(t), ..., A_r(t))^T, \quad \boldsymbol{\mu}(t) = (\mu_1(t), ..., \mu_r(t))^T, \tag{26}
\]

to denote the arrival and service vectors at time $t$.

2 Note that all our results can easily be extended to the case when $S(t)$
evolves according to a finite state aperiodic and irreducible Markov chain, by
using the results developed in [20].



C. Queueing, Average Cost, and the Objective 



Let $\boldsymbol{q}(t) = (q_1(t), ..., q_r(t))^T$, $t = 0, 1, 2, ...$ be
the queue backlog vector process of the network, in units of
contents. Due to the feasibility condition (24) of the actions,
we see that the queues evolve according to the following
dynamics:

\[
q_j(t+1) = q_j(t) - \mu_j(t) + A_j(t), \quad \forall j, \ t \ge 0, \tag{27}
\]

with some $\|\boldsymbol{q}(0)\| < \infty$. Note that using a nonzero $q_j(0)$
can be viewed as placing an "initial stock" in the queues to
facilitate algorithm implementation. In this paper, we adopt
the following notion of queue stability:

\[
\bar{q} \triangleq \limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{j=1}^{r} \mathbb{E}\{q_j(\tau)\} < \infty. \tag{28}
\]

We also use $f^{\Pi}_{av}$ to denote the time average utility induced by
an action-choosing policy $\Pi$, defined as:

\[
f^{\Pi}_{av} \triangleq \liminf_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{f^{\Pi}(\tau)\}, \tag{29}
\]

where $f^{\Pi}(\tau)$ is the utility incurred at time $\tau$ by policy $\Pi$. We
call an action-choosing policy feasible if at every time slot $t$ it
only chooses actions from the feasible action set $\mathcal{X}^{(S(t))}$ that
satisfy (24). We then call a feasible action-choosing policy
under which (28) holds a stable policy, and use $f^*_{av}$ to denote
the optimal time average utility over all stable policies.

In every slot, the network controller observes the current
network state and the queue backlog vector, and chooses a
feasible control action that ensures (24), with the objective
of maximizing the time average utility subject to network
stability. Note that if condition (24) can be ignored, and if any
processor only requires contents from a single queue, then this
problem falls into the general stochastic network optimization
framework considered in [18], in which case it can be solved
by using the usual Max-Weight algorithm to achieve a utility
that is within $O(1/V)$ of the optimal while ensuring that the
average network backlog is $O(V)$.

3 The use of $r + 2$ actions here is due to the use of Caratheodory's theorem
[21] in the proof of Theorem 1.



V. Upper bounding the optimal utility 

In this section, we first obtain an upper bound of the optimal 
utility that the network controller can achieve. This upper 
bound will later be used to analyze the performance of our 
algorithm. The result is summarized in the following theorem. 

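Theorem 1: Suppose the initial queue backlog $\boldsymbol{q}(0)$ satisfies
$\mathbb{E}\{q_j(0)\} < \infty$ for all $j = 1, ..., r$. Then we have:

\[
V f^*_{av} \le \phi^*, \tag{30}
\]

where $\phi^*$ is the optimal value of the following problem:

\begin{align}
\max: \quad & \phi = \sum_{s_i} \pi_{s_i} V \sum_{k=1}^{r+2} a_k^{(s_i)} f(s_i, x_k^{(s_i)}) \tag{31} \\
\text{s.t.} \quad & \sum_{s_i} \pi_{s_i} \sum_{k=1}^{r+2} a_k^{(s_i)} A_j(s_i, x_k^{(s_i)})
   = \sum_{s_i} \pi_{s_i} \sum_{k=1}^{r+2} a_k^{(s_i)} \mu_j(s_i, x_k^{(s_i)}), \quad \forall j, \tag{32} \\
& x_k^{(s_i)} \in \mathcal{X}^{(s_i)}, \quad \forall s_i, k, \tag{33} \\
& a_k^{(s_i)} \ge 0, \ \forall s_i, k, \quad \sum_{k=1}^{r+2} a_k^{(s_i)} = 1, \ \forall s_i. \tag{34}
\end{align}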

Proof: See Appendix A. ■

Note that the problem (31) only requires that the time average
input rate into a queue is equal to its time average output rate.
This requirement ignores the action feasibility constraint (24),
and makes (31) easier to solve than the scheduling problem.
We now look at the dual problem of the problem (31). The
following lemma shows that the dual problem of (31) does not
have to include the variables $\{a_k^{(s_i)}\}_{i=1,\ldots,M}^{k=1,\ldots,r+2}$. This lemma
will also be useful for our later analysis.

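Lemma 1: The dual problem of (31) is given by:

\[
\min: \ g(\boldsymbol{\gamma}), \quad \text{s.t.} \ \boldsymbol{\gamma} \in \mathbb{R}^r, \tag{35}
\]

where the function $g(\boldsymbol{\gamma})$ is defined:

\[
g(\boldsymbol{\gamma}) = \sup_{x^{(s_i)} \in \mathcal{X}^{(s_i)}} \sum_{s_i} \pi_{s_i} \Big\{ V f(s_i, x^{(s_i)})
 - \sum_{j} \gamma_j \big[ A_j(s_i, x^{(s_i)}) - \mu_j(s_i, x^{(s_i)}) \big] \Big\}. \tag{36}
\]

Moreover, let $\boldsymbol{\gamma}^*$ be any optimal solution of (35); we have:

\[
g(\boldsymbol{\gamma}^*) \ge \phi^*. \tag{37}
\]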

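Proof: (Lemma 1) It is easy to see from (31) that the dual
function is given by:

\[
\hat{g}(\boldsymbol{\gamma}) = \sup_{x_k^{(s_i)}, a_k^{(s_i)}} \sum_{s_i} \pi_{s_i} \Big\{ \sum_{k=1}^{r+2} a_k^{(s_i)} V f(s_i, x_k^{(s_i)})
 - \sum_{j} \gamma_j \sum_{k=1}^{r+2} a_k^{(s_i)} \big[ A_j(s_i, x_k^{(s_i)}) - \mu_j(s_i, x_k^{(s_i)}) \big] \Big\}. \tag{38}
\]

Due to the use of the $\{a_k^{(s_i)}\}_{i=1,\ldots,M}^{k=1,\ldots,r+2}$ variables, it is
easy to see that $\hat{g}(\boldsymbol{\gamma}) \ge g(\boldsymbol{\gamma})$. However, if $\{x^{(s_i)}\}_{i=1}^{M}$
is a set of maximizers of $g(\boldsymbol{\gamma})$, then the set of variables
$\{x_k^{(s_i)}, a_k^{(s_i)}\}_{i=1,\ldots,M}^{k=1,\ldots,r+2}$ where for each $s_i$, $x_k^{(s_i)} = x^{(s_i)}$ for
all $k$, and $a_1^{(s_i)} = 1$ with $a_k^{(s_i)} = 0$ for all $k \ge 2$, will also
be maximizers of $\hat{g}(\boldsymbol{\gamma})$. Thus $g(\boldsymbol{\gamma}) \ge \hat{g}(\boldsymbol{\gamma})$. This shows that
$g(\boldsymbol{\gamma}) = \hat{g}(\boldsymbol{\gamma})$, and hence $g(\boldsymbol{\gamma})$ is the dual function of (31).
(37) follows from weak duality [21]. ■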
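In the following, it is useful to define the following function:

\[
g_{s_i}(\boldsymbol{\gamma}) = \sup_{x^{(s_i)} \in \mathcal{X}^{(s_i)}} \Big\{ V f(s_i, x^{(s_i)})
 - \sum_{j} \gamma_j \big[ A_j(s_i, x^{(s_i)}) - \mu_j(s_i, x^{(s_i)}) \big] \Big\}. \tag{39}
\]

That is, $g_{s_i}(\boldsymbol{\gamma})$ is the dual function of (31) when there is a
single network state $s_i$. We can see from (36) and (39) that:

\[
g(\boldsymbol{\gamma}) = \sum_{s_i} \pi_{s_i} g_{s_i}(\boldsymbol{\gamma}). \tag{40}
\]

In the following, we will use $\boldsymbol{\gamma}^* = (\gamma_1^*, ..., \gamma_r^*)^T$ to denote an
optimal solution of the problem (35).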
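
To make the dual function concrete, the sketch below evaluates it by brute-force enumeration for the three-queue example of Section III, where both the state space and the action space are finite. Two assumptions are made here and are not from the paper: the random state is taken to be the triple $(R_1, R_2, p)$ with all eight values equally likely, and the function name `dual_function` is hypothetical.

```python
from itertools import product

def dual_function(gamma, V):
    """Evaluate the dual function of the three-queue example by enumeration.

    gamma is a length-3 sequence of multipliers, V the control parameter.
    """
    total = 0.0
    for R1, R2, p in product((0, 1), (0, 1), (1, 3)):       # 8 equally likely states
        best = float("-inf")
        for D1, D2, I1, I2 in product((0, 1), repeat=4):     # all 16 actions
            f = p * I2 - D1 * R1 - D2 * R2                   # utility (5)
            A = (D1 * R1, D2 * R2, I1)                       # arrivals (2)
            mu = (I1, I1, I2)                                # service (3)
            value = V * f - sum(g * (a - m) for g, a, m in zip(gamma, A, mu))
            best = max(best, value)
        total += best / 8                                    # state probability 1/8
    return total
```

For $V = 10$, evaluating at $\boldsymbol{\gamma} = (-10, -10, -20)$ gives 5, i.e., $V/2$, while other multipliers give larger values. This is consistent with weak duality: every value of the dual function upper-bounds $\phi^*$ and hence $V f^*_{av}$.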



VI. The Perturbed Max-Weight Algorithm and Its Performance

In this section, we develop the general Perturbed Max-Weight
algorithm (PMW) to solve our scheduling problem. To
start, we first choose a perturbation vector $\boldsymbol{\theta} = (\theta_1, ..., \theta_r)^T$.
Then we define the following weighted perturbed Lyapunov
function with some positive constants $\{w_j\}_{j=1}^{r}$:

\[
L(t) \triangleq \frac{1}{2} \sum_{j=1}^{r} w_j (q_j(t) - \theta_j)^2. \tag{41}
\]

We then define the one-slot conditional drift as in (6), i.e.,
$\Delta(t) = \mathbb{E}\{L(t+1) - L(t) \mid \boldsymbol{q}(t)\}$. We will similarly use
the "drift-plus-penalty" approach in Section III to construct
the algorithm. Specifically, we first use the queueing dynamic
equation (27), and have the following lemma:

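Lemma 2: Under any feasible control policy that can be
implemented at time $t$, we have:

\[
\Delta(t) - V\mathbb{E}\{f(t) \mid \boldsymbol{q}(t)\} \le B - V\mathbb{E}\{f(t) \mid \boldsymbol{q}(t)\}
 - \sum_{j=1}^{r} w_j (q_j(t) - \theta_j) \mathbb{E}\{[\mu_j(t) - A_j(t)] \mid \boldsymbol{q}(t)\}, \tag{42}
\]

where $B = \delta_{max}^2 \sum_{j=1}^{r} w_j$.

Proof: See Appendix B. ■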
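The general Perturbed Max-Weight algorithm (PMW) is
then obtained by choosing an action $x(t)$ from $\mathcal{X}^{(S(t))}$ at time
$t$ to minimize the right-hand side (RHS) of (42) subject to (24).

Specifically, define the function $D^{(s_i)}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}(x)$ as:

\[
D^{(s_i)}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}(x) \triangleq V f(s_i, x)
 + \sum_{j=1}^{r} w_j (q_j(t) - \theta_j) \big[ \mu_j(s_i, x) - A_j(s_i, x) \big]. \tag{43}
\]

We see that the function $D^{(s_i)}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}(x)$ is indeed the term inside
the conditional expectation on the RHS of (42), which enters with a
negative sign, so minimizing that RHS amounts to maximizing
$D^{(s_i)}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}(x)$. We now
also define $D^{(s_i)*}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}$ to be the optimal value of the following
problem:

\[
\max: \ D^{(s_i)}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}(x), \quad \text{s.t.} \ x \in \mathcal{X}^{(s_i)}. \tag{44}
\]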



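Hence $D^{(s_i)*}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}$ is the maximum value of $D^{(s_i)}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}(x)$ over all
possible policies, including those that may not consider the
no-underflow constraint (24). The general Perturbed Max-Weight
algorithm (PMW) then works as follows:

PMW: Initialize the perturbation vector $\boldsymbol{\theta}$. At every time
slot $t$, observe the current network state $S(t)$ and the backlog
$\boldsymbol{q}(t)$. If $S(t) = s_i$, choose $x^{(s_i)} \in \mathcal{X}^{(s_i)}$ subject to (24) that
makes the value of $D^{(s_i)}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}(x)$ close to $D^{(s_i)*}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}$.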
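
When the action set for a state is finite (or has been discretized), one PMW step can be sketched as an exhaustive search over the actions that pass the no-underflow test (24). The function `pmw_step`, the callables `f`, `A`, `mu`, and the use of a finite candidate list in place of the compact set are all assumptions of this sketch; the usage at the end re-creates the Section III example with the state taken as $(R_1, R_2, p)$.

```python
from itertools import product

def pmw_step(state, q, theta, w, V, actions, f, A, mu):
    """Pick a feasible action maximizing the weighted objective (43).

    actions: finite list of candidate actions for this state
    f(state, x) -> utility; A(state, x), mu(state, x) -> per-queue sequences
    Returns (best_action, best_value), or (None, -inf) if nothing is feasible.
    """
    best_action, best_value = None, float("-inf")
    for x in actions:
        service, arrivals = mu(state, x), A(state, x)
        # No-underflow constraint (24): each queue must hold what the action consumes.
        if any(qj < mj for qj, mj in zip(q, service)):
            continue
        value = V * f(state, x) + sum(
            wj * (qj - tj) * (mj - aj)
            for wj, qj, tj, mj, aj in zip(w, q, theta, service, arrivals)
        )
        if value > best_value:
            best_action, best_value = x, value
    return best_action, best_value

# Usage on the Section III example: state = (R1, R2, p), action = (D1, D2, I1, I2).
example_f = lambda s, x: s[2] * x[3] - x[0] * s[0] - x[1] * s[1]
example_A = lambda s, x: (x[0] * s[0], x[1] * s[1], x[2])
example_mu = lambda s, x: (x[2], x[2], x[3])
all_actions = list(product((0, 1), repeat=4))
choice, value = pmw_step((1, 1, 3), (0, 0, 0), (20, 20, 30), (1, 1, 1), 10,
                         all_actions, example_f, example_A, example_mu)
```

With empty queues, only actions that activate no processor are feasible, and the search selects admission on both streams, matching rule (9) of the example.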

Note that depending on the problem structure, the PMW
algorithm can usually be implemented easily, e.g., [11].
Now we analyze the performance of the PMW algorithm. We
will prove our result under the following condition:

Condition 1: There exists some finite constant $C \ge 0$, such
that at every time slot $t$ with a network state $S(t)$, the value
of $D^{(S(t))}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}(x)$ under PMW is at least $D^{(S(t))*}_{\boldsymbol{\theta}, \boldsymbol{q}(t)} - C$.

The immediate consequence of Condition 1 is that PMW
also minimizes the RHS of (42), i.e., the conditional expectation,
to within $C$ of its minimum value over all possible
policies. If $C = 0$, then PMW simultaneously ensures (24) and
minimizes the RHS of (42), e.g., as in the example in Section
III. However, we note that Condition 1 does not require the
value of $D^{(S(t))}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}(x)$ to be exactly the same as $D^{(S(t))*}_{\boldsymbol{\theta}, \boldsymbol{q}(t)}$. This
allows for more flexibility in constructing the PMW algorithm
(see Section VII for an example). We also note that Condition
1 can be ensured, e.g., by carefully choosing the $\theta_j$ values
to ensure $q_j(t) \ge \delta_{max}$ for all time. We will show that,
under Condition 1, PMW achieves a time average utility that
is within $O(1/V)$ of $f^*_{av}$, while guaranteeing that the time
average network queue size is $O(V) + \sum_j w_j \theta_j$, which is
$O(V)$ if $\boldsymbol{\theta} = \Theta(V)$ and $w_j = O(1)$, $\forall j$. The following
theorem summarizes PMW's performance results.

Theorem 2: Suppose that (25) holds, that Condition 1
holds, and that $\mathbb{E}\{q_j(0)\} < \infty$ for all $j = 1, ..., r$. Then
under PMW, we have (see footnote 4):

\[
f^{PMW}_{av} \ge f^*_{av} - \frac{B + C}{V}, \tag{45}
\]

\[
\bar{q}^{PMW} \le \frac{B + C + 2V\delta_{max}}{\eta} + \sum_{j=1}^{r} w_j \theta_j. \tag{46}
\]

Here $B = \delta_{max}^2 \sum_{j=1}^{r} w_j$, $\eta$ is the slackness parameter in
Section IV-B, $f^{PMW}_{av}$ is defined in (29) to be the time average
expected utility of PMW, and $\bar{q}^{PMW}$ is the time average
expected weighted network backlog under PMW, defined:

\[
\bar{q}^{PMW} \triangleq \limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{j=1}^{r} w_j \mathbb{E}\{q_j(\tau)\}.
\]

Proof: See Appendix C. ■
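
The two bounds of Theorem 2 are simple closed-form expressions, so their dependence on $V$ can be tabulated directly. The sketch below evaluates the utility gap $(B + C)/V$ and the backlog bound of (46); the function name `pmw_bounds` is hypothetical, and the numbers used afterwards (in particular $\eta = 0.5$) are illustrative assumptions rather than values stated in the paper.

```python
def pmw_bounds(B, C, V, delta_max, eta, w, theta):
    """Evaluate the guarantees of Theorem 2.

    Returns (utility_gap, backlog_bound), where utility_gap = (B + C) / V is the
    guaranteed distance from the optimal utility in (45) and backlog_bound is the
    right-hand side of (46).
    """
    utility_gap = (B + C) / V
    backlog_bound = (B + C + 2 * V * delta_max) / eta + sum(
        wj * tj for wj, tj in zip(w, theta)
    )
    return utility_gap, backlog_bound

# Illustrative values in the spirit of the Section III example:
# B = 3, C = 0, delta_max = 1, unit weights, theta = (2V, 2V, 3V), and an assumed eta = 0.5.
V = 100
gap, bound = pmw_bounds(3, 0, V, 1, 0.5, (1, 1, 1), (2 * V, 2 * V, 3 * V))
```

Doubling $V$ halves the utility gap while roughly doubling the backlog bound, which is the $[O(1/V), O(V)]$ tradeoff in numerical form.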
Theorem 2 shows that if Condition 1 holds, then PMW can be
used as in previous networking problems, e.g., [11], [12], to
obtain explicit utility-backlog tradeoffs. We note that a condition
similar to Condition 1 was assumed in [2]. However, [2]
only considers the usual Max-Weight algorithm, in which
case (24) may not be satisfied for all time. In contrast, PMW
resolves this problem by carefully choosing the perturbation
vector. One such example of PMW is recent work
that applies PMW to an assembly line scheduling problem
and achieves an $[O(1/V), O(V)]$ utility-backlog tradeoff.

4 It is easy to see that (46) ensures (28), hence the network is stable under PMW.

VII. Constructing PMW for Networks with Output Reward

In this section, we look at a specific yet general processing
network model, and explicitly construct a PMW algorithm,
including finding the proper $\boldsymbol{\theta}$ vector and choosing actions at
each time slot.

A. Network Model 

We assume that the network is modeled by an acyclic
directed graph $\mathcal{G} = (\mathcal{Q}, \mathcal{P}, \mathcal{L})$. Here $\mathcal{Q} = \mathcal{Q}^s \cup \mathcal{Q}^{in}$ is
the set of queues, consisting of the set of source queues
$\mathcal{Q}^s$ where arrivals enter the network, and the set of internal
queues $\mathcal{Q}^{in}$ where contents are stored for further processing.
$\mathcal{P} = \mathcal{P}^{in} \cup \mathcal{P}^o$ is the set of processors, consisting of a set of
internal processors $\mathcal{P}^{in}$, which generate partially processed
contents for further processing at other processors, and output
processors $\mathcal{P}^o$, which generate fully processed contents and
deliver them to the output. $\mathcal{L}$ is the set of directed links that
connects $\mathcal{Q}$ and $\mathcal{P}$. Note that a link only exists between a
queue in $\mathcal{Q}$ and a processor in $\mathcal{P}$. We denote $N_p^{in} = |\mathcal{P}^{in}|$,
$N_p^o = |\mathcal{P}^o|$ and $N_p = N_p^{in} + N_p^o$. We also denote $N_q^s = |\mathcal{Q}^s|$,
$N_q^{in} = |\mathcal{Q}^{in}|$ and $N_q = N_q^s + N_q^{in}$.

Each processor $P_n$, when activated, consumes a certain amount of contents from a set of supply queues, denoted by $\mathcal{Q}^S_n$, and generates some amount of new contents. These new contents either go to a set of demand queues, denoted by $\mathcal{Q}^D_n$, if $P_n \in \mathcal{P}^{in}$, or are delivered to the output if $P_n \in \mathcal{P}^o$. For any queue $q_j \in \mathcal{Q}$, we use $\mathcal{P}^S_j$ to denote the set of processors that $q_j$ serves as a supply queue, and use $\mathcal{P}^D_j$ to denote the set of processors that $q_j$ serves as a demand queue. An example of such a network is shown in Fig. 2. In the following, we assume that for each processor $P_i \in \mathcal{P}^{in}$, $|\mathcal{Q}^D_i| = 1$; i.e., each processor only generates contents for a single demand queue.

We use $\beta_{nj}$ to denote the amount processor $P_n$ consumes from a queue $q_j$ in $\mathcal{Q}^S_n$ when it is activated. For each $P_i \in \mathcal{P}^{in}$, we also use $\alpha_{ih}$ to denote the amount $P_i$ generates into the queue $q_h$ if $q_h = \mathcal{Q}^D_i$, when it is activated. For a processor $P_k \in \mathcal{P}^o$, we use $\alpha_{ko}$ to denote the amount of output generated by it when it is turned on. We denote $\beta_{max} = \max_{i,j} \beta_{ij}$, $\beta_{min} = \min_{i,j} \beta_{ij}$ and $\alpha_{max} = \max_{i,j} [\alpha_{ij}, \alpha_{io}]$. We assume that $\beta_{min}, \beta_{max}, \alpha_{max} > 0$. We also define $M_p$ to be the maximum number of supply queues that any processor can have, define $M^d_q$ to be the maximum number of processors that any queue can serve as a demand queue, and define $M^s_q$ to be the maximum number of processors that any queue can serve as a supply queue. We use $R_j(t)$ to denote the amount of contents arriving to a source queue $q_j \in \mathcal{Q}^s$ at time $t$. We assume $R_j(t)$ is i.i.d. every slot, and that $R_j(t) \le R_{max}$ for all $q_j \in \mathcal{Q}^s$ and all $t$. We assume that there are no exogenous arrivals into the queues in $\mathcal{Q}^{in}$.

5 Note that here we only consider binary actions of processors. Our results 
can also be generalized into the case when there are multiple operation levels 
under which different amount of contents will be consumed and generated. 
Fig. 2. A general processing network. 

We assume that in every slot $t$, admitting any unit amount of $R_j(t)$ arrival incurs a cost of $c_j(t)$, and that activating any internal processor $P_i \in \mathcal{P}^{in}$ incurs a cost of $C_i(t)$, whereas activating any output processor $P_k \in \mathcal{P}^o$ generates a profit of $p_k(t)$ per unit output content. We assume $c_j(t)$, $C_i(t)$, $p_k(t)$ are all i.i.d. every time slot. In the following, we also assume that $p_{min} \le p_k(t) \le p_{max}$, and that $c_{min} \le c_j(t) \le c_{max}$ and $C_{min} \le C_i(t) \le C_{max}$ for all $k, j, i$ and for all time.

Below, we use $I_n(t) \in \{0, 1\}$ to denote the activation decision of $P_n$, i.e., $I_n(t) = 1$ ($I_n(t) = 0$) means that $P_n$ is activated (turned off). We also use $D_j(t) \in [0, 1]$ to denote the portion of arrivals from $R_j(t)$ that are admitted into $q_j$. We assume there exists some general constraint on how the processors can be activated, which can be due to, e.g., resource sharing among processors. We model this constraint by defining an activation vector $\boldsymbol{I}(t) = (I_1(t), \ldots, I_{N_p}(t))$, and then assume that $\boldsymbol{I}(t) \in \mathcal{I}$ for all time, where $\mathcal{I}$ denotes the set of all feasible processor activation decision vectors, assuming all the queues have enough contents for processing. We assume that if a vector $\boldsymbol{I} \in \mathcal{I}$, then by changing one element of $\boldsymbol{I}$ from one to zero, the newly obtained vector $\boldsymbol{I}'$ satisfies $\boldsymbol{I}' \in \mathcal{I}$. Note that the chosen vector $\boldsymbol{I}(t)$ must always ensure the constraint (24), which in this case implies that $\boldsymbol{I}(t)$ has to satisfy the following constraint:

$$q_j(t) \ge \sum_{n \in \mathcal{P}^S_j} I_n(t)\beta_{nj}, \quad \forall\, j = 1, \ldots, r. \tag{47}$$

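Under this constraint, we see that the queues evolve according to the following queueing dynamics:

$$q_j(t+1) = q_j(t) - \sum_{n \in \mathcal{P}^S_j} I_n(t)\beta_{nj} + D_j(t)R_j(t), \quad \forall\, j \in \mathcal{Q}^s,$$

$$q_j(t+1) = q_j(t) - \sum_{n \in \mathcal{P}^S_j} I_n(t)\beta_{nj} + \sum_{n \in \mathcal{P}^D_j} I_n(t)\alpha_{nj}, \quad \forall\, j \in \mathcal{Q}^{in}.$$

Note that we have used $j \in \mathcal{Q}$ to represent $q_j \in \mathcal{Q}$, and use $n \in \mathcal{P}$ to represent $P_n \in \mathcal{P}$ in the above for notation simplicity. The objective is to maximize the time average of the following utility function:

$$f(t) = \sum_{k \in \mathcal{P}^o} I_k(t)p_k(t)\alpha_{ko} - \sum_{j \in \mathcal{Q}^s} D_j(t)R_j(t)c_j(t) - \sum_{i \in \mathcal{P}^{in}} I_i(t)C_i(t). \tag{48}$$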

6 This can be viewed as the difference between profit and cost associated 
with these processors. 
(48) can be used to model applications where generating completely processed contents is the primary target, e.g., [5].
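The queueing dynamics and the per-slot utility of this model can be sketched in a few lines of code. This is an illustrative sketch only: the `Processor` class, the `slot_update` function and the two-queue chain in the usage example are hypothetical names and data, not part of the paper. The utility includes the activation cost $C_i(t)$ of internal processors, as in the drift expression derived later in this section, and the underflow constraint (47) is checked before any queue is modified.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Processor:
    supply: Dict[str, float]        # queue name -> beta_nj consumed per activation
    demand: Optional[str] = None    # demand queue name, None for an output processor
    alpha: float = 1.0              # alpha_nh (internal) or alpha_ko (output)


def slot_update(q, procs, I, D, R, p, c, C):
    """Apply one slot of the queue dynamics; return (new_q, utility f(t))."""
    # Underflow constraint (47): every queue must cover what is drawn from it.
    for j in q:
        need = sum(pr.supply.get(j, 0.0) for n, pr in procs.items() if I[n])
        if q[j] < need:
            raise ValueError(f"underflow at queue {j}")
    new_q = dict(q)
    utility = 0.0
    for n, pr in procs.items():
        if not I[n]:
            continue
        for j, b in pr.supply.items():
            new_q[j] -= b                     # consumption from supply queues
        if pr.demand is None:
            utility += p[n] * pr.alpha        # profit of an output processor
        else:
            new_q[pr.demand] += pr.alpha      # generation into the demand queue
            utility -= C[n]                   # activation cost of an internal processor
    for j, r in R.items():                    # exogenous arrivals, source queues only
        new_q[j] += D[j] * r
        utility -= D[j] * r * c[j]            # admission cost
    return new_q, utility


if __name__ == "__main__":
    # Hypothetical chain: q1 (source) -> P1 -> q2 -> P2 (output).
    procs = {
        "P1": Processor(supply={"q1": 1.0}, demand="q2", alpha=2.0),
        "P2": Processor(supply={"q2": 1.0}, demand=None, alpha=2.0),
    }
    print(slot_update({"q1": 3.0, "q2": 1.0}, procs, {"P1": 1, "P2": 1},
                      {"q1": 1.0}, {"q1": 2.0}, {"P2": 3.0}, {"q1": 0.5}, {"P1": 1.0}))
```

Raising an error on underflow is a design choice of this sketch; it makes an infeasible activation vector visible immediately instead of silently producing negative queue sizes.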

B. Relation to the general model 

We see that in this network model, the network state, the action, and the traffic and service functions are given by:

• The network state is given by:

$$S(t) = (c_j(t), j \in \mathcal{Q}^s,\; C_i(t), i \in \mathcal{P}^{in},\; p_k(t), k \in \mathcal{P}^o).$$

• The action $x(t) = (D_j(t), j \in \mathcal{Q}^s,\; I_n(t), n \in \mathcal{P})$.

• The arrival functions are given by:

$$A_j(t) = A_j(S(t), x(t)) = D_j(t)R_j(t), \quad \forall\, q_j \in \mathcal{Q}^s,$$

$$A_j(t) = A_j(S(t), x(t)) = \sum_{n \in \mathcal{P}^D_j} I_n(t)\alpha_{nj}, \quad \forall\, q_j \in \mathcal{Q}^{in}.$$

• The service functions are given by:

$$\mu_j(t) = \mu_j(S(t), x(t)) = \sum_{n \in \mathcal{P}^S_j} I_n(t)\beta_{nj}, \quad \forall\, j.$$

Thus, we see that this network model falls into the general processing network framework in Section IV, and Theorem 2 will apply in this case. Therefore, in the following, we will construct our PMW algorithm to ensure that Condition 1 holds.

C. The PMW algorithm 

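We now obtain the PMW algorithm for this general network in the following. We will look for a perturbation vector that is the same in all entries, i.e., $\boldsymbol{\theta} = \theta\boldsymbol{1}$. We first compute the "drift-plus-penalty" expression using the weighted perturbed Lyapunov function defined in (41) under some given positive constants $\{w_j\}_{j=1}^{r}$ and some nonzero constant $\theta$:

$$\Delta(t) - VE\{f(t) \mid \boldsymbol{q}(t)\} \le B \tag{49}$$
$$\quad - \sum_{j \in \mathcal{Q}^s} E\Big\{w_j[q_j(t) - \theta]\Big[\sum_{n \in \mathcal{P}^S_j} I_n(t)\beta_{nj} - R_j(t)D_j(t)\Big] \,\Big|\, \boldsymbol{q}(t)\Big\}$$
$$\quad - \sum_{j \in \mathcal{Q}^{in}} E\Big\{w_j[q_j(t) - \theta]\Big[\sum_{n \in \mathcal{P}^S_j} I_n(t)\beta_{nj} - \sum_{n \in \mathcal{P}^D_j} I_n(t)\alpha_{nj}\Big] \,\Big|\, \boldsymbol{q}(t)\Big\}$$
$$\quad - VE\Big\{\sum_{k \in \mathcal{P}^o} I_k(t)p_k(t)\alpha_{ko} - \sum_{j \in \mathcal{Q}^s} D_j(t)R_j(t)c_j(t) - \sum_{i \in \mathcal{P}^{in}} I_i(t)C_i(t) \,\Big|\, \boldsymbol{q}(t)\Big\}.$$

Here $B = w_{max}\big[N_q(M^s_q\beta_{max})^2 + N^s_q R^2_{max} + N^{in}_q(M^d_q\alpha_{max})^2\big]$, where $w_{max} = \max_j w_j$. We also denote $w_{min} = \min_j w_j$.

Rearranging the terms, we get the following:

$$\Delta(t) - VE\{f(t) \mid \boldsymbol{q}(t)\} \le B \tag{50}$$
$$\quad + \sum_{j \in \mathcal{Q}^s} E\big\{[Vc_j(t) + w_j(q_j(t) - \theta)]D_j(t)R_j(t) \,\big|\, \boldsymbol{q}(t)\big\}$$
$$\quad - \sum_{k \in \mathcal{P}^o} E\Big\{I_k(t)\Big[\sum_{j \in \mathcal{Q}^S_k} w_j(q_j(t) - \theta)\beta_{kj} + Vp_k(t)\alpha_{ko}\Big] \,\Big|\, \boldsymbol{q}(t)\Big\}$$
$$\quad - \sum_{i \in \mathcal{P}^{in}} E\Big\{I_i(t)\Big[\sum_{j \in \mathcal{Q}^S_i} w_j(q_j(t) - \theta)\beta_{ij} - w_h(q_h(t) - \theta)\alpha_{ih} - VC_i(t)\Big] \,\Big|\, \boldsymbol{q}(t)\Big\}.$$

Here in the last term $q_h = \mathcal{Q}^D_i$. We now present the PMW algorithm. We see that in this case the $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ function is given by:

$$D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x) = -\sum_{j \in \mathcal{Q}^s} [Vc_j(t) + w_j(q_j(t) - \theta)]D_j(t)R_j(t) \tag{51}$$
$$\quad + \sum_{k \in \mathcal{P}^o} I_k(t)\Big[\sum_{j \in \mathcal{Q}^S_k} w_j(q_j(t) - \theta)\beta_{kj} + Vp_k(t)\alpha_{ko}\Big]$$
$$\quad + \sum_{i \in \mathcal{P}^{in}} I_i(t)\Big[\sum_{j \in \mathcal{Q}^S_i} w_j(q_j(t) - \theta)\beta_{ij} - w_h(q_h(t) - \theta)\alpha_{ih} - VC_i(t)\Big].$$

Our goal is to design PMW in a way such that under any network state $S(t)$, the value of $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ is close to $D^{(S(t))*}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$, which is the maximum value of $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ without the underflow constraint (47), i.e.,

$$D^{(S(t))*}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x) = \max_{D_j(t) \in [0,1],\; \boldsymbol{I}(t) \in \mathcal{I}} D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x).$$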

Specifically, PMW works as follows: 

PMW: Initialize $\theta$. At every time slot $t$, observe $S(t)$ and $\boldsymbol{q}(t)$, and do the following:

1) Content Admission: Choose $D_j(t) = 1$, i.e., admit all new arrivals to $q_j \in \mathcal{Q}^s$ if:

$$Vc_j(t) + w_j(q_j(t) - \theta) < 0, \tag{52}$$

else set $D_j(t) = 0$.

2) Processor Activation: For each $P_i \in \mathcal{P}^{in}$, define its weight $W^{(in)}_i(t)$ as:

$$W^{(in)}_i(t) = \Big[\sum_{j \in \mathcal{Q}^S_i} w_j[q_j(t) - \theta]\beta_{ij} - w_h[q_h(t) - \theta]\alpha_{ih} - VC_i(t)\Big]^+, \tag{53}$$

where $q_h = \mathcal{Q}^D_i$. Similarly, for each $P_k \in \mathcal{P}^o$, define its weight $W^{(o)}_k(t)$ as:

$$W^{(o)}_k(t) = \Big[\sum_{j \in \mathcal{Q}^S_k} w_j[q_j(t) - \theta]\beta_{kj} + Vp_k(t)\alpha_{ko}\Big]^+. \tag{54}$$

Then, choose an activation vector $\boldsymbol{I}(t)$ from $\mathcal{I}$ to maximize:

$$\sum_{i \in \mathcal{P}^{in}} I_i(t)W^{(in)}_i(t) + \sum_{k \in \mathcal{P}^o} I_k(t)W^{(o)}_k(t), \tag{55}$$

subject to the following queue edge constraints:

a) For each $P_i \in \mathcal{P}^{in}$, set $I_i(t) = 1$, i.e., activate processor $P_i$, only if:

• $q_j(t) \ge M^s_q\beta_{max}$ for all $q_j \in \mathcal{Q}^S_i$,
• $q_h(t) \le \theta$, where $q_h = \mathcal{Q}^D_i$.

b) For each $P_k \in \mathcal{P}^o$, choose $I_k(t) = 1$ only if:

• $q_j(t) \ge M^s_q\beta_{max}$ for all $q_j \in \mathcal{Q}^S_k$.

The approach of imposing the queue edge constraints was inspired by the work [22], where similar constraints are imposed for routing problems. Note that without these queue edge constraints, PMW will be the same as the action that maximizes $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ without the underflow constraint (47).
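The two steps of PMW can be sketched as a single per-slot decision routine. The sketch below makes one simplifying assumption that is not part of the general algorithm: processors can be activated independently, so the feasible set contains every binary vector and the maximization in (55) reduces to activating each processor that has a positive weight and passes its queue edge constraints. The function name `pmw_slot_decision`, the dictionary layout and the example numbers are hypothetical.

```python
def pmw_slot_decision(q, procs, w, theta, V, c, p, C, Ms_q, beta_max):
    """One PMW decision, assuming independent processor activation.

    q: queue sizes, w: queue weights, c: admission costs of source queues,
    p: profits of output processors, C: costs of internal processors.
    procs[n] = {"supply": {queue: beta}, "demand": queue or None, "alpha": amount}.
    Returns (D, I): admission and activation decisions.
    """
    # 1) Content admission, rule (52).
    D = {j: 1 if V * c[j] + w[j] * (q[j] - theta) < 0 else 0 for j in c}

    # 2) Processor activation with weights (53)-(54) and queue edge constraints.
    I = {}
    for n, pr in procs.items():
        weight = sum(w[j] * (q[j] - theta) * b for j, b in pr["supply"].items())
        edge_ok = all(q[j] >= Ms_q * beta_max for j in pr["supply"])
        h = pr["demand"]
        if h is None:                                   # output processor, (54)
            weight += V * p[n] * pr["alpha"]
        else:                                           # internal processor, (53)
            weight -= w[h] * (q[h] - theta) * pr["alpha"] + V * C[n]
            edge_ok = edge_ok and q[h] <= theta
        weight = max(weight, 0.0)                       # the [.]^+ operation
        I[n] = 1 if edge_ok and weight > 0 else 0
    return D, I


if __name__ == "__main__":
    # Hypothetical chain: q1 (source) -> P1 -> q2 -> P2 (output).
    procs = {
        "P1": {"supply": {"q1": 1.0}, "demand": "q2", "alpha": 2.0},
        "P2": {"supply": {"q2": 1.0}, "demand": None, "alpha": 2.0},
    }
    print(pmw_slot_decision({"q1": 10.0, "q2": 5.0}, procs, {"q1": 2.0, "q2": 1.0},
                            theta=60.0, V=10.0, c={"q1": 0.0}, p={"P2": 3.0},
                            C={"P1": 0.5}, Ms_q=1, beta_max=1.0))
```

When the feasible activation set couples processors (for example through shared resources), the greedy per-processor rule above no longer maximizes (55) and would have to be replaced by a search over the feasible vectors.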
D. Performance

Here we show that PMW indeed ensures that the value of $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ is within some additive constant of $D^{(S(t))*}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$. In the following, we assume that:

$$\theta \ge \max\Big[\frac{V\alpha_{max}p_{max}}{w_{min}\beta_{min}},\; \ldots\Big]. \tag{56}$$

We also assume that the $\{w_j\}_{j=1}^{r}$ values are chosen such that for any processor $P_i \in \mathcal{P}^{in}$ with the demand queue $q_h$, we have for any supply queue $q_j \in \mathcal{Q}^S_i$ that:

$$w_j\beta_{ij} \ge w_h\alpha_{ih}. \tag{57}$$

We note that (56) can easily be satisfied and only requires $\theta = \Theta(V)$. A way of choosing the $\{w_j\}_{j=1}^{r}$ values to satisfy (57) is given in Section VII-E. Note that in the special case when $\beta_{ij} = \alpha_{ij} = 1$ for all $i, j$, simply using $w_j = 1$, $\forall j$ meets the condition (57).

We first look at the queueing bounds. By (52), $q_j$ admits new arrivals only when $q_j(t) < \theta - Vc_{min}/w_j$. Thus:

$$q_j(t) \le \theta - Vc_{min}/w_j + R_{max}, \quad \forall\, q_j \in \mathcal{Q}^s,\, t. \tag{58}$$

Now by the processor activation rule, we also see that:

$$0 \le q_j(t) \le \theta + M^d_q\alpha_{max}, \quad \forall\, q_j \in \mathcal{Q}^{in},\, t. \tag{59}$$

This is because under the PMW algorithm, a processor is activated only when all its supply queues have at least $M^s_q\beta_{max}$ units of contents, and when its demand queue has at most $\theta$ units of contents. The first requirement ensures that $q_j(t) \ge 0$ for all time, while the second requirement ensures that $q_j(t) \le \theta + M^d_q\alpha_{max}$. Below, by defining:

$$\nu_{max} \triangleq \max[M^d_q\alpha_{max},\, R_{max},\, M^s_q\beta_{max}], \tag{60}$$

we can compactly write (58) and (59) as:

$$0 \le q_j(t) \le \theta + \nu_{max}, \quad \forall\, q_j \in \mathcal{Q},\, t. \tag{61}$$

To prove the performance of the PMW algorithm, it suffices to prove the following lemma, which shows that Condition 1 holds for some finite constant $C$ under the PMW algorithm.

Lemma 3: Suppose (56) and (57) hold. Then under PMW, $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x) \ge D^{(S(t))*}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x) - C$, where $C = N_p w_{max} M_p \nu_{max} \beta_{max}$.

Proof: See Appendix D. ■

We can now directly use Theorem 2 to have the following corollary concerning the performance of PMW in this case:

Corollary 1: Suppose (25), (56) and (57) hold. Then PMW achieves the following:

$$f^{PMW}_{av} \ge f^*_{av} - \frac{B + C}{V}, \tag{62}$$

$$\bar{q}^{PMW} \le \frac{B + C + 2V\delta_{max}}{\eta} + \theta\sum_{j=1}^{r} w_j, \tag{63}$$

where $C = N_p w_{max} M_p \nu_{max} \beta_{max}$, and $f^{PMW}_{av}$ and $\bar{q}^{PMW}$ are the time average expected utility and time average expected weighted backlog under PMW, respectively. ■

Note that here $\delta_{max}$ can be chosen to be:

$$\delta_{max} = \max\big[\nu_{max},\; N^o_p p_{max}\alpha_{max},\; N^s_q R_{max}c_{max} + N^{in}_p C_{max}\big].$$

Also, since (56) only requires $\theta = \Theta(V)$, and $w_j = \Theta(1)$ for all $j$, we see that PMW indeed achieves an $[O(1/V), O(V)]$ utility-backlog tradeoff in this case.

E. Choosing the $\{w_j\}_{j=1}^{r}$ values

Here we describe how to choose the $\{w_j\}_{j=1}^{r}$ values to satisfy (57). We first let $K$ be the maximum number of processors that any path going from a queue to an output processor can have. It is easy to see that $K \le N_p$ since there is no cycle in the network. The following algorithm terminates in $K$ iterations. We use $w_j(k)$ to denote the value of $w_j$ at the $k$th iteration. In the following, we use $q_{h_n}$ to denote the demand queue of a processor $P_n$.

1) At Iteration 1, denote the set of queues that serve as supply queues for any output processor as $\mathcal{Q}^l_1$, i.e.,

$$\mathcal{Q}^l_1 = \{q_j : \mathcal{P}^S_j \cap \mathcal{P}^o \ne \emptyset\}.$$

Then set $w_j(1) = 1$ for each $q_j \in \mathcal{Q}^l_1$. Also, set $w_j(1) = 0$ for all other $q_j \notin \mathcal{Q}^l_1$.

2) At Iteration $k = 2, \ldots, K$, denote $\mathcal{Q}^l_k$ to be the set of queues that serve as supply queues for any processor whose demand queue is in $\mathcal{Q}^l_{k-1}$, i.e.,

$$\mathcal{Q}^l_k = \{q_j : \exists\, P_n \in \mathcal{P}^S_j \text{ s.t. } \mathcal{Q}^D_n \in \mathcal{Q}^l_{k-1}\}.$$

Then set:

$$w_j(k) = \max\Big[w_j(k-1),\; \max_{n \in \mathcal{P}^S_j} \frac{w_{h_n}(k-1)\alpha_{nh_n}}{\beta_{nj}}\Big], \tag{64}$$

where $\alpha_{nh_n}$ is the amount $P_n$ generates into $q_{h_n}$, which is the demand queue of $P_n$. Also, set $w_j(k) = w_j(k-1)$ for all $q_j \notin \mathcal{Q}^l_k$.

3) Output the $\{w_j\}_{j=1}^{r}$ values.

The following lemma shows that the above algorithm outputs a set of $\{w_j\}_{j=1}^{r}$ values that satisfy (57).

Lemma 4: The $\{w_j\}_{j=1}^{r}$ values generated by the above algorithm satisfy (57).

Proof: See Appendix E. ■

As a concrete example, we consider the example in Fig. 2 with the assumption that each processor, when activated, consumes one unit of content from each of its supply queues and generates two units of contents into its demand queue. In this example, we see that $K = 3$. Thus the algorithm works as follows:

1) Iteration 1, denote $\mathcal{Q}^l_1 = \{q_4, q_5, q_6\}$, set $w_4(1) = w_5(1) = w_6(1) = 1$. For all other queues, set $w_j(1) = 0$.

2) Iteration 2, denote $\mathcal{Q}^l_2 = \{q_1, q_2, q_3, q_4, q_5\}$, set $w_1(2) = w_2(2) = w_3(2) = w_4(2) = w_5(2) = 2$. Set $w_6(2) = 1$.

3) Iteration 3, denote $\mathcal{Q}^l_3 = \{q_2, q_3\}$, set $w_2(3) = w_3(3) = 4$. Set $w_1(3) = w_4(3) = w_5(3) = 2$, $w_6(3) = 1$.

4) Terminate and output $w_1 = w_4 = w_5 = 2$, $w_2 = w_3 = 4$, $w_6 = 1$.
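The iteration above can be sketched in code as follows. This is a minimal sketch under stated assumptions: the function names `choose_weights` and `satisfies_57` and the dictionary-based network description are hypothetical, the inner maximum in (64) is taken over the internal processors that a queue supplies, and the three-processor chain in the usage example is an illustrative topology, not the network of Fig. 2.

```python
def choose_weights(supply, demand, alpha, beta):
    """Compute queue weights by the level-by-level iteration of Section VII-E.

    supply[n]: set of supply queues of processor n
    demand[n]: demand queue of processor n, or None for an output processor
    alpha[n]:  amount n generates into its demand queue
    beta[(n, j)]: amount n consumes from queue j
    """
    queues = {j for s in supply.values() for j in s}
    queues |= {h for h in demand.values() if h is not None}
    # Iteration 1: queues that supply some output processor get weight 1.
    level = {j for n, s in supply.items() if demand[n] is None for j in s}
    w = {j: (1.0 if j in level else 0.0) for j in queues}
    # Iterations 2..K, with K <= N_p because the graph is acyclic.
    for _ in range(len(supply) - 1):
        prev_w = dict(w)
        level = {j for n, s in supply.items() if demand[n] in level for j in s}
        if not level:
            break
        for j in level:
            candidates = [prev_w[demand[n]] * alpha[n] / beta[(n, j)]
                          for n, s in supply.items()
                          if j in s and demand[n] is not None]
            w[j] = max([prev_w[j]] + candidates)   # update rule (64)
    return w


def satisfies_57(w, supply, demand, alpha, beta):
    """Check w_j * beta_ij >= w_h * alpha_ih for every internal processor."""
    return all(w[j] * beta[(n, j)] >= w[demand[n]] * alpha[n]
               for n, s in supply.items() if demand[n] is not None for j in s)


if __name__ == "__main__":
    # Hypothetical chain: q1 -> P1 -> q2 -> P2 -> q3 -> P3 (output).
    supply = {"P1": {"q1"}, "P2": {"q2"}, "P3": {"q3"}}
    demand = {"P1": "q2", "P2": "q3", "P3": None}
    alpha = {"P1": 2.0, "P2": 2.0, "P3": 2.0}
    beta = {("P1", "q1"): 1.0, ("P2", "q2"): 1.0, ("P3", "q3"): 1.0}
    w = choose_weights(supply, demand, alpha, beta)
    print(w, satisfies_57(w, supply, demand, alpha, beta))
```

On this chain the weights double at each step away from the output, mirroring how the weights in the Fig. 2 example grow from 1 to 2 to 4 when every processor consumes one unit and generates two.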

VIII. Simulation 

In this section, we simulate the example given in Fig. 2. In this example, we assume each $R_j(t)$ is Bernoulli, being 0 or 2 with equal probabilities. For each $P_i \in \mathcal{P}^{in}$, i.e., $P_1, P_2, P_3$, $C_i(t)$ is assumed to be 1 or 10 with probabilities 0.3 and 0.7, respectively. For the output processors $P_k \in \mathcal{P}^o$, i.e., $P_4$ and $P_5$, we assume that $p_k(t) = 1$ or 3 with probabilities 0.6 and 0.4, respectively. We assume that each processor, when activated, takes one unit of content from each of its supply queues and generates two units of contents into its demand queue (or to the output if it is an output processor). We further assume that all processors can be turned on without affecting others. Note that in this case, we have $c_j(t) = 0$ for all source queues $q_j$.

It is easy to see that in this case $M_p = M^s_q = M^d_q = 2$, $\beta_{max} = \beta_{min} = 1$, and $\alpha_{max} = 2$. Using the results in the above, we choose $w_6 = 1$, $w_1 = w_4 = w_5 = 2$, $w_2 = w_3 = 4$. We also use $\theta = 6V$ according to (56). We simulate the PMW algorithm for $V \in \{5, 7, 10, 15, 20, 50, 100\}$. Each simulation is run over $5 \times 10^6$ slots.
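The constants used in this setup can be checked with a short calculation. The sketch below rests on two assumptions that should be read as such: the term of (56) that matters here is taken to be $V\alpha_{max}p_{max}/(w_{min}\beta_{min})$, which reproduces the stated choice $\theta = 6V$, and $\nu_{max}$ is taken to be the largest of $M^d_q\alpha_{max}$, $R_{max}$ and $M^s_q\beta_{max}$. All variable names are illustrative.

```python
# Parameters stated for the simulated example.
M_p, Ms_q, Md_q = 2, 2, 2
beta_max = beta_min = 1
alpha_max = 2
R_max = 2                      # R_j(t) takes values 0 or 2
p_max = 3                      # p_k(t) takes values 1 or 3
N_p = 5                        # processors P_1, ..., P_5
w = {1: 2, 2: 4, 3: 4, 4: 2, 5: 2, 6: 1}

w_min, w_max = min(w.values()), max(w.values())

# Assumed relevant term of (56): theta >= V * alpha_max * p_max / (w_min * beta_min).
theta_per_V = alpha_max * p_max / (w_min * beta_min)

# Assumed form of nu_max: max of the three per-slot change bounds.
nu_max = max(Md_q * alpha_max, R_max, Ms_q * beta_max)

# Constant C of Lemma 3 and the queue bound theta + nu_max of (61).
C = N_p * w_max * M_p * nu_max * beta_max


def queue_bound(V):
    """Upper bound on every queue size under the assumptions above."""
    return theta_per_V * V + nu_max


if __name__ == "__main__":
    print("theta / V =", theta_per_V, " nu_max =", nu_max, " C =", C)
    for V in (5, 7, 10, 15, 20, 50, 100):
        print(f"V={V:3d}  theta={theta_per_V * V:6.1f}  queue bound={queue_bound(V):6.1f}")
```

Under these assumptions every queue stays below $6V + 4$, so for $V = 100$ no sample path should exceed 604 units.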

Fig. 3 shows the utility and backlog performance of the PMW algorithm. We see that as $V$ increases, the average utility performance quickly converges to the optimal value. The average backlog size also only grows linearly in $V$.

Fig. 3. Utility and backlog performance of PMW.

Fig. 4 also shows three sample path queue processes in the first $10^4$ slots under $V = 100$. We see that no queue has an underflow. This shows that all the activation decisions of PMW are feasible. It is also easy to verify that the queueing bounds (58) and (59) hold for all time.




Fig. 4. Sample path backlog processes with V = 100. 

We observe in Fig. 4 that the queue sizes usually fluctuate around certain fixed values. A similar "exponential attraction" phenomenon has been observed in prior work [19]. Hence our results can also be extended, using the results developed in 030 , to achieve an average utility that is within $O(1/V)$ of the optimal with only $O([\log(V)]^2)$ average backlog size. In this case, we can also implement the PMW algorithm with finite buffers using the idea of floating queues in [23], which works as follows: For each $q_j$, we associate with it an actual buffer of size $O([\log(V)]^2)$ and a counter. When contents are sent into the queue and the buffer is not full, we store the contents in the actual buffer. However, when the buffer is full and contents are sent to $q_j$, these contents are dropped but the counter is incremented. If instead contents are consumed from $q_j$ but $q_j$ does not have enough contents, then the counter is decremented, and the action in that slot is assumed to be null. Under this method, it can be shown that the dropping and underflow events happen only with a very small probability. Hence almost all actions are valid. Thus we lose a tiny fraction in the utility performance, but reduce the average backlog size from $O(V)$ to $O([\log(V)]^2)$.

IX. Conclusion 

In this paper, we develop the Perturbed Max-Weight algorithm (PMW) for utility optimization problems in general processing networks. PMW is based on the usual Max-Weight algorithm for data networks. It has two main functionalities: queue underflow prevention and utility optimal scheduling. PMW simultaneously achieves both objectives by carefully perturbing the weights used in the usual Max-Weight algorithm. We show that PMW is able to achieve an $[O(1/V), O(V)]$ utility-backlog tradeoff. The PMW algorithm developed here can be applied to problems in the areas of data fusion, stream processing and cloud computing.

Appendix A - Proof of Theorem 1

We prove Theorem 1 in this section, using an argument similar to the one used in [12].

Proof: (Theorem 1) Consider any stable scheduling policy $\Pi$, i.e., the conditions (24) and (28) are satisfied under $\Pi$. We let $\{(f(0), \boldsymbol{A}(0), \boldsymbol{\mu}(0)), (f(1), \boldsymbol{A}(1), \boldsymbol{\mu}(1)), \ldots\}$ be a sequence of (utility, arrival, service) triples generated by $\Pi$. Then there exists a subsequence of times $\{T_i\}_{i=1,2,\ldots}$ such that $T_i \to \infty$ and that the limiting time average utility over times $T_i$ is equal to the liminf average utility under $\Pi$ (defined by (29)). Now define the conditional average of utility, and arrival minus service over $T$ slots to be:

$$(\phi^{(s_i)}(T); \epsilon^{(s_i)}_1(T); \ldots; \epsilon^{(s_i)}_r(T)) \triangleq \frac{1}{T}\sum_{t=0}^{T-1} E\{f(t); \epsilon_1(t); \ldots; \epsilon_r(t) \mid S(t) = s_i\}, \tag{65}$$

where $\epsilon_j(t) = A_j(t) - \mu_j(t)$. Using Caratheodory's theorem, it can be shown, as in [12], that there exists a set of variables $\{a^{(s_i)}_k(T)\}_{k=1}^{r+2}$ and a set of actions $\{x^{(s_i)}_k(T)\}_{k=1}^{r+2}$ such that:

$$\phi^{(s_i)}(T) = \sum_{k=1}^{r+2} a^{(s_i)}_k(T) f(s_i, x^{(s_i)}_k(T)),$$

and for all $j = 1, \ldots, r$ that:

$$\epsilon^{(s_i)}_j(T) = \sum_{k=1}^{r+2} a^{(s_i)}_k(T)\big[A_j(s_i, x^{(s_i)}_k(T)) - \mu_j(s_i, x^{(s_i)}_k(T))\big].$$

Now using the continuity of $f(s_i, \cdot)$, $A_j(s_i, \cdot)$, $\mu_j(s_i, \cdot)$, and the compactness of all the action sets $\mathcal{X}^{(s_i)}$, we can thus find a sub-subsequence $\tilde{T}_i \to \infty$ of $\{T_i\}_{i=1,2,\ldots}$ such that:

$$a^{(s_i)}_k(\tilde{T}_i) \to a^{(s_i)}_k, \quad x^{(s_i)}_k(\tilde{T}_i) \to x^{(s_i)}_k, \tag{66}$$

$$\phi^{(s_i)}(\tilde{T}_i) \to \phi^{(s_i)}, \quad \epsilon^{(s_i)}_j(\tilde{T}_i) \to \epsilon^{(s_i)}_j, \quad \forall\, j = 1, \ldots, r. \tag{67}$$

Therefore the time average utility under the policy $\Pi$ can be expressed as:

$$f^{\Pi}_{av} = \sum_{s_i} \pi_{s_i}\phi^{(s_i)} = \sum_{s_i} \pi_{s_i}\sum_{k=1}^{r+2} a^{(s_i)}_k f(s_i, x^{(s_i)}_k). \tag{68}$$

Similarly, the average arrival rate minus the average service rate under $\Pi$ can be written as:

$$\epsilon_j = \sum_{s_i} \pi_{s_i}\sum_{k=1}^{r+2} a^{(s_i)}_k\big[A_j(s_i, x^{(s_i)}_k) - \mu_j(s_i, x^{(s_i)}_k)\big] \le 0. \tag{69}$$

The last inequality is due to the fact that $\Pi$ is a stable policy and that $E\{q_j(0)\} < \infty$, hence the average arrival rate to any $q_j$ must be no more than the average service rate of the queue [24]. However, by (24) we see that what is consumed from a queue is always no more than what is generated into the queue. This implies that the input rate into a queue is always no less than its output rate. Thus, $\epsilon_j \ge 0$ for all $j$. Therefore we conclude that $\epsilon_j = 0$ for all $j$. Using this fact and (68), we see that $Vf^{\Pi}_{av} \le \phi^*$, where $\phi^*$ is the optimal value appearing in Theorem 1. This proves Theorem 1. ■

Appendix B - Proof of Lemma 2

Here we prove Lemma 2.

Proof: Using the queueing equation (27), we have:

$$[q_j(t+1) - \theta_j]^2 = [(q_j(t) - \mu_j(t) + A_j(t)) - \theta_j]^2$$
$$= [q_j(t) - \theta_j]^2 + (\mu_j(t) - A_j(t))^2 - 2(q_j(t) - \theta_j)[\mu_j(t) - A_j(t)]$$
$$\le [q_j(t) - \theta_j]^2 + 2\delta^2_{max} - 2(q_j(t) - \theta_j)[\mu_j(t) - A_j(t)].$$

Multiplying both sides with $\frac{w_j}{2}$ and summing the above over $j = 1, \ldots, r$, we see that:

$$L(t+1) - L(t) \le B - \sum_{j=1}^{r} w_j(q_j(t) - \theta_j)[\mu_j(t) - A_j(t)],$$

where $B = \delta^2_{max}\sum_{j=1}^{r} w_j$. Now adding to both sides the term $-Vf(t)$, we get:

$$L(t+1) - L(t) - Vf(t) \le B - Vf(t) - \sum_{j=1}^{r} w_j(q_j(t) - \theta_j)[\mu_j(t) - A_j(t)]. \tag{70}$$

Taking expectations over the random network state $S(t)$ on both sides conditioning on $\boldsymbol{q}(t)$ proves the lemma. ■



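Appendix C - Proof of Theorem 2

Here we prove Theorem 2. We first have the following simple lemma.

Lemma 5: For any network state $s_i$, we have:

$$D^{(s_i)*}_{\boldsymbol{\theta},\boldsymbol{q}(t)} = g_{s_i}((\boldsymbol{q}(t) - \boldsymbol{\theta}) \otimes \boldsymbol{w}), \tag{71}$$

where $\boldsymbol{w} = (w_1, \ldots, w_r)^T$.

Proof: By comparing (44) with (39), we see that the lemma follows. ■

Proof: (Theorem 2) We first recall the equation (70) as follows:

$$L(t+1) - L(t) - Vf(t) \le B - Vf(t) - \sum_{j=1}^{r} w_j(q_j(t) - \theta_j)[\mu_j(t) - A_j(t)]. \tag{72}$$

Using $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ defined in (43), this can be written as:

$$L(t+1) - L(t) - Vf(t) \le B - D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x(t)).$$

Here $x(t)$ is PMW's action at time $t$. According to Condition 1, we see that for any network state $S(t) = s_i$, PMW ensures (24), and that:

$$D^{(s_i)}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x) \ge D^{(s_i)*}_{\boldsymbol{\theta},\boldsymbol{q}(t)} - C.$$

Using (71), this implies that under PMW,

$$L(t+1) - L(t) - Vf(t) \le B - g_{s_i}((\boldsymbol{q}(t) - \boldsymbol{\theta}) \otimes \boldsymbol{w}) + C.$$

Taking expectations over the random network state on both sides conditioning on $\boldsymbol{q}(t)$, and using (40), i.e., $g(\boldsymbol{\gamma}) = \sum_{s_i} \pi_{s_i} g_{s_i}(\boldsymbol{\gamma})$, we get:

$$\Delta(t) - VE\{f(t) \mid \boldsymbol{q}(t)\} \le B + C - g((\boldsymbol{q}(t) - \boldsymbol{\theta}) \otimes \boldsymbol{w}). \tag{73}$$

Now using Theorem 1 and Lemma 1 we have:

$$Vf^*_{av} \le \phi^* \le g(\boldsymbol{\gamma}^*) \le g((\boldsymbol{q}(t) - \boldsymbol{\theta}) \otimes \boldsymbol{w}).$$

Therefore,

$$\Delta(t) - VE\{f(t) \mid \boldsymbol{q}(t)\} \le B + C - Vf^*_{av}. \tag{74}$$

Taking expectations over $\boldsymbol{q}(t)$ on both sides and summing the above over $t = 0, \ldots, T-1$, we get:

$$E\{L(T) - L(0)\} - \sum_{t=0}^{T-1} VE\{f(t)\} \le T(B + C) - TVf^*_{av}.$$

Rearranging terms, dividing both sides by $VT$, using the facts that $L(t) \ge 0$ and $E\{L(0)\} < \infty$, and taking the liminf as $T \to \infty$, we get:

$$f^{PMW}_{av} \ge f^*_{av} - (B + C)/V. \tag{75}$$

This proves (45). Now we prove (46). First, by using the definition of $g(\boldsymbol{\gamma})$ in (38), and plugging in the $\{x^{(s_i)}_k, a^{(s_i)}_k\}$ variables in the $\eta$-slackness assumption (25) in Section IV-B, we see that:

$$g((\boldsymbol{q}(t) - \boldsymbol{\theta}) \otimes \boldsymbol{w}) \ge \eta\sum_{j=1}^{r} w_j[q_j(t) - \theta_j] - \cdots \tag{76}$$

This by Lemma 1 implies that:

$$g((\boldsymbol{q}(t) - \boldsymbol{\theta}) \otimes \boldsymbol{w}) \ge \eta\sum_{j=1}^{r} w_j[q_j(t) - \theta_j] - V\delta_{max}.$$

Using this in (73), we get:

$$\Delta(t) - VE\{f(t) \mid \boldsymbol{q}(t)\} \le B + C + V\delta_{max} - \eta\sum_{j=1}^{r} w_j[q_j(t) - \theta_j].$$

We can now use a similar argument as above to get:

$$\eta\sum_{t=0}^{T-1}\sum_{j=1}^{r} w_j E\{q_j(t) - \theta_j\} \le T(B + C) + 2TV\delta_{max} + E\{L(0)\}.$$

Dividing both sides by $\eta T$ and taking the limsup as $T \to \infty$, we get:

$$\bar{q}^{PMW} \le \frac{B + C + 2V\delta_{max}}{\eta} + \sum_{j=1}^{r} w_j\theta_j.$$

This completes the proof of the theorem. ■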



Appendix D - Proof of Lemma 3

Here we prove Lemma 3 by comparing the values of the three terms in $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ in (51) under PMW versus their values under the action that maximizes $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ in (51) subject to only the constraints $D_j(t) \in [0, 1]$, $\forall j \in \mathcal{Q}^s$ and $\boldsymbol{I}(t) \in \mathcal{I}$, called the max-action. That is, under the max-action, $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x) = D^{(S(t))*}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$. Note that the max-action differs from PMW only in that it does not consider the queue edge constraint.

Proof: (A) We see that the first term, i.e., $-\sum_{j \in \mathcal{Q}^s}[Vc_j(t) + w_j(q_j(t) - \theta)]D_j(t)R_j(t)$, is maximized under PMW. Thus its value is the same as that under the max-action.

(B) We now show that for any processor $P_n \in \mathcal{P}$, if it violates the queue edge constraint, then its weight is bounded by $M_p w_{max}\nu_{max}\beta_{max}$. This will then be used in Part (C) below to show that the value of $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ under PMW is within a constant of $D^{(S(t))*}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ under the max-action.

(B-I) For any $P_i \in \mathcal{P}^{in}$, the following are the only two cases under which $P_i$ violates the queue edge constraint.

1) Its demand queue $q_h(t) > \theta$. In this case, it is easy to see from (53) and (61) that:

$$W^{(in)}_i(t) \le \sum_{j \in \mathcal{Q}^S_i} w_j\nu_{max}\beta_{ij} \le M_p w_{max}\nu_{max}\beta_{max}. \tag{77}$$

2) One of $P_i$'s supply queues has a queue size less than $M^s_q\beta_{max}$. In this case, we denote $\hat{\mathcal{Q}}^S_i = \{q_j \in \mathcal{Q}^S_i : q_j(t) \ge M^s_q\beta_{max}\}$. Then we see that:

$$W^{(in)}_i(t) \le \sum_{j \in \hat{\mathcal{Q}}^S_i} w_j\nu_{max}\beta_{ij} + w_h\theta\alpha_{ih} + \sum_{j \in \mathcal{Q}^S_i \setminus \hat{\mathcal{Q}}^S_i} w_j(M^s_q\beta_{max} - \theta)\beta_{ij}.$$

Here $q_h = \mathcal{Q}^D_i$. Now by our selection of $\{w_j\}_{j=1}^{r}$, $w_j\beta_{ij} \ge w_h\alpha_{ih}$ for any $q_j \in \mathcal{Q}^S_i$. Also using $\nu_{max} \ge M^s_q\beta_{max}$, we have:

$$W^{(in)}_i(t) \le M_p w_{max}\nu_{max}\beta_{max}. \tag{78}$$

(B-II) For any $P_k \in \mathcal{P}^o$, we see that it violates the queue edge constraint only when one of its supply queues has size less than $M^s_q\beta_{max}$. In this case, with $\hat{\mathcal{Q}}^S_k$ defined as in case 2) above, we see that:

$$W^{(o)}_k(t) \le \sum_{j \in \hat{\mathcal{Q}}^S_k} w_j(q_j(t) - \theta)\beta_{kj} + Vp_k(t)\alpha_{ko} + \sum_{j \in \mathcal{Q}^S_k \setminus \hat{\mathcal{Q}}^S_k} w_j(M^s_q\beta_{max} - \theta)\beta_{kj}$$
$$\le M_p w_{max}\nu_{max}\beta_{max} + V\alpha_{max}p_{max} - w_{min}\theta\beta_{min}.$$

This by (56) implies that:

$$W^{(o)}_k(t) \le M_p w_{max}\nu_{max}\beta_{max}. \tag{79}$$

Using (77), (78) and (79), we see that whenever a processor violates the queue edge constraint, its weight is at most $M_p w_{max}\nu_{max}\beta_{max}$.

(C) We now show that the value of $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ under PMW satisfies $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x) \ge D^{(S(t))*}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x) - C$, where $C = N_p M_p w_{max}\nu_{max}\beta_{max}$.

To see this, let $\boldsymbol{I}^*(t)$ be the activation vector obtained by the max-action, and let $W^*(t)$ be the value of (55) under $\boldsymbol{I}^*(t)$. We also use $\boldsymbol{I}^{PMW}(t)$ and $W^{PMW}(t)$ to denote the activation vector chosen by the PMW algorithm and the value of (55) under $\boldsymbol{I}^{PMW}(t)$. We now construct an alternate activation vector $\hat{\boldsymbol{I}}(t)$ by changing all elements in $\boldsymbol{I}^*(t)$ corresponding to the processors that violate the queue edge constraints to zero. Note then $\hat{\boldsymbol{I}}(t) \in \mathcal{I}$ is a feasible activation vector at time $t$, under which no processor violates the queue edge constraint. By Part (B) above, we see that the value of (55) under $\hat{\boldsymbol{I}}(t)$, denoted by $\hat{W}(t)$, satisfies:

$$\hat{W}(t) \ge W^*(t) - N_p M_p w_{max}\nu_{max}\beta_{max}.$$

Now since $\boldsymbol{I}^{PMW}(t)$ maximizes the value of (55) under the queue edge constraints, we have:

$$W^{PMW}(t) \ge \hat{W}(t) \ge W^*(t) - N_p w_{max} M_p \nu_{max}\beta_{max}.$$

Thus, by combining the above and Part (A), we see that PMW maximizes $D^{(S(t))}_{\boldsymbol{\theta},\boldsymbol{q}(t)}(x)$ to within $C = N_p M_p w_{max}\nu_{max}\beta_{max}$ of the maximum. ■
Appendix E - Proof of Lemma 4

Proof: (Proof of Lemma 4) The proof consists of two main steps. In the first step, we show that the algorithm updates each $w_j$ value at least once. This shows that all the $w_j$ values for all the queues that serve as demand queues are updated at least once. In the second step, we show that if $q_h$ is the demand queue of a processor $P_i \in \mathcal{P}^{in}$, then every time after $w_h$ is updated, the algorithm will also update $w_j$ for any $q_j \in \mathcal{Q}^S_i$ before it terminates. This ensures that (57) holds for any $P_i \in \mathcal{P}^{in}$ and hence proves the lemma.

First we see that after $K$ iterations, we must have $\mathcal{Q} \subset \cup_{k=1}^{K} \mathcal{Q}^l_k$. This is because at Iteration $k$, we include in $\cup_{\tau=1}^{k} \mathcal{Q}^l_\tau$ all the queues starting from which there exists a path to an output processor that contains $k$ processors. Thus all the $w_j$ values are updated at least once.

Now consider a queue $q_h$. Suppose $q_h$ is the demand queue of a processor $P_i \in \mathcal{P}^{in}$. We see that there exists a time $\hat{k} \le K$ at which $w_h$ is last modified. Suppose $w_h$ is last modified at Iteration $\hat{k} < K$, in which case $q_h \in \mathcal{Q}^l_{\hat{k}}$. Then all the queues $q_j \in \mathcal{Q}^S_i$ will be in $\mathcal{Q}^l_{\hat{k}+1}$. Thus their $w_j$ values will be modified at Iteration $\hat{k} + 1 \le K$. This implies that at Iteration $\hat{k} + 1$, we will have $w_j(\hat{k} + 1)\beta_{ij} \ge w_h(\hat{k})\alpha_{ih}$. Since $q_h \notin \mathcal{Q}^l_k$ for $k \ge \hat{k} + 1$, we have $w_h(k) = w_h(\hat{k})$ for all $k \ge \hat{k} + 1$. Therefore $w_j(k)\beta_{ij} \ge w_h(k)\alpha_{ih}$ $\forall\, \hat{k} + 1 \le k \le K$, because $w_j(k)$ is not decreasing.

Therefore the only case when the algorithm can fail is when $w_h$ is updated at Iteration $k = K$, in which case $w_h$ may increase but the $w_j$ values for $q_j \in \mathcal{Q}^S_i$ are not modified accordingly. However, since $w_h$ is updated at Iteration $k = K$, this implies that there exists a path from $q_h$ to an output processor that has $K$ processors. This in turn implies that starting from any $q_j \in \mathcal{Q}^S_i$, there exists a path to an output processor that contains $K + 1$ processors. This contradicts the definition of $K$. Thus the lemma follows. ■